I am still on Vercel (yes I know, trying to migrate off...) and it gives you automated alerts when there are anomalous traffic spikes. Funnily enough, I have had about 10 scrapers from various places scraping the site in the last week.
Or want to get absolutely ripped off by the non-government sellers that are somehow allowed on those platforms. The whole point of GovDeals et al. is that the seller is a known-ish quantity. If I wanted to roll the dice on garbage with fresh paint I'd be on Ritchie Bros.
GovDeals managed services (or whatever they call it now) is just as questionable as 3rd-party sellers on any given big-box store's ecommerce "platform".
I'm curious how much of this stuff is actually civil asset forfeiture? (Not to blame the site(s) for such practices, but to think about the whole ecosystem of how a government comes to have a bicycle, switch, truck etc)
I almost bought a lighthouse 25 years ago off of a GSA auction. I'm glad my bid lost because I didn't read the fine print carefully about how much the upkeep would cost.
Cool idea, I tapped the vehicles / heavy equipment tabs expecting to be taken to listings. Nothing happened. Maybe these should all take you to a page that lets you sign up to see listings?
I work in the HR space and thought about something similar with jobs. A bunch of recruiters will publish the same job. What I figured would be best is to have one job entry, show all the recruiters who published it, and let you go to one of them.
First thing I searched was Pokemon cards and found items with bids 50% higher than market value... either shill bidding or folks who are bidding blindly.
US government auctions are scattered across at least 28 platforms. GSA sells decommissioned federal fleet. DLA Disposition moves military gear. The US Marshals front seized property through bid4assets. PublicSurplus runs school district and state-agency lots. GovDeals fronts thousands of county and municipal agencies. Fannie Mae and HUD auction foreclosed homes. None of these sites index together, and most have search UX that lost a fight with 2008.
I hope you're adding some fictitious entries (https://en.wikipedia.org/wiki/Fictitious_entry) to track where those scrapes might be going.
I didn't realize the government could become Amazon.
Mil-spec transport containers? Excellent.
Mil-spec rucksacks? Not so excellent.
https://reason.com/category/criminal-justice/civil-asset-for...
> Error: Invalid frameId for foreground frameId: 0
on Chrome 147.0.7727.102
"The server is under heavy load. Please try again in a moment."
So I scraped them all and put one search box in front. 180,276 active listings as of today, normalized into a shared schema in Postgres with full-text search. About 53,000 new listings come in every week.
A few real things you can buy this week, all live in the data:
- A 2000 Bell 430 helicopter (executive model), $250k starting, 0 bids: https://www.govdeals.com/asset/8103/23762
- A 1985 Cessna 182R aircraft in Missouri, $33k starting, 0 bids: https://www.govdeals.com/asset/36476/430
- An M75 APC armored personnel carrier on Ritchie Bros, no bids yet: https://www.rbauction.com/pdp/armored-tank-m75-apc-personnel...
- A Rolls-Royce ship thruster, never used, $500k starting: https://www.govdeals.com/asset/247/16144
- A 2.3 kg iridium-platinum ingot (police seizure on PropertyRoom), 52 bids, currently $175k: https://www.propertyroom.com/l/iridium-platinum-ingot-ir90-p...
- A 1927 Seagrave fire truck, "runs, drives, and titled," $24k, 0 bids: https://www.govdeals.com/asset/285/16223
- A truck-mounted forklift from a manufacturer literally named "Donkey & Burro": https://www.govplanet.com/for-sale/Forklifts/14842632
The work that took longest wasn't the scraping (each source has its own quirky JSON or HTML), it was the dedup. The same Fannie Mae foreclosure shows up under three different addresses across three platforms. A "2008 Ford F-150" from GSA Fleet looks structurally identical to one from PublicSurplus, but they're different vehicles with different VINs, and the only way to know is to fingerprint enough metadata to make a confident match.
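To make the fingerprinting idea concrete, here is a minimal sketch of one way it could work, not the site's actual implementation. The `Listing` shape, the field names, and the abbreviation table are all assumptions for illustration. Vehicles are keyed on VIN only, so two identical-looking F-150s never merge, and real estate is keyed on a normalized address plus ZIP, so "123 North Main Street" and "123 N. Main St." collapse to one key. Anything without enough metadata gets no fingerprint and is treated as unique.

```typescript
// Hypothetical normalized listing; these field names are assumptions,
// not the schema the site actually uses.
interface Listing {
  source: string;
  title: string;
  category: "vehicle" | "real_estate" | "other";
  vin?: string;
  address?: string;
  zip?: string;
}

// Small illustrative abbreviation table; a real one would be much longer.
const ADDRESS_ABBREVIATIONS: Record<string, string> = {
  street: "st", avenue: "ave", road: "rd", drive: "dr",
  north: "n", south: "s", east: "e", west: "w",
};

// Lowercase, replace punctuation with spaces, collapse whitespace.
function normalizeText(value: string): string {
  return value.toLowerCase().replace(/[^a-z0-9\s]/g, " ").replace(/\s+/g, " ").trim();
}

function normalizeAddress(address: string): string {
  return normalizeText(address)
    .split(" ")
    .map((word) => ADDRESS_ABBREVIATIONS[word] ?? word)
    .join(" ");
}

// Returns a key shared by listings believed to be the same item,
// or null when there is not enough metadata for a confident match.
function fingerprint(listing: Listing): string | null {
  if (listing.category === "vehicle") {
    const vin = listing.vin?.toUpperCase().replace(/[^A-Z0-9]/g, "") ?? "";
    // Titles alone are never enough: only a full 17-character VIN counts.
    return vin.length === 17 ? `vin:${vin}` : null;
  }
  if (listing.category === "real_estate" && listing.address && listing.zip) {
    return `addr:${normalizeAddress(listing.address)}|${listing.zip.slice(0, 5)}`;
  }
  return null;
}

// Groups listings by fingerprint; listings without one are left out
// of the map and should be kept as unique rows.
function groupDuplicates(listings: Listing[]): Map<string, Listing[]> {
  const groups = new Map<string, Listing[]>();
  for (const listing of listings) {
    const key = fingerprint(listing);
    if (key === null) continue;
    const group = groups.get(key) ?? [];
    group.push(listing);
    groups.set(key, group);
  }
  return groups;
}
```

Returning `null` instead of guessing is the conservative choice here: a missed duplicate costs one extra search result, while a false merge hides a real listing.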
There's a deal score per listing (price vs category median, bid velocity, time remaining, starting-bid ratio) and SEO landing pages per state-by-category combo, mostly because long-tail government-auction queries on Google are nearly all unanswered.
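For anyone wondering how four signals like these might combine into one number, here is a rough sketch. The weights, the 72-hour cutoff, the field names, and my reading of "starting-bid ratio" (starting bid divided by current price) are all guesses for illustration, not the formula the site actually uses.

```typescript
// Hypothetical inputs; the field names are assumptions.
interface ScoreInput {
  currentPrice: number;   // current high bid, or starting bid if none (USD)
  categoryMedian: number; // median price for the listing's category (USD)
  startingBid: number;    // USD
  bidCount: number;
  hoursListed: number;
  hoursRemaining: number;
}

const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

// Returns an integer score from 0 (poor) to 100 (strong deal).
function dealScore(input: ScoreInput): number {
  // Price vs category median: further below the median scores higher.
  const priceSignal =
    input.categoryMedian > 0 ? clamp01(1 - input.currentPrice / input.categoryMedian) : 0;

  // Bid velocity: fewer bids per hour means less competition.
  const bidsPerHour = input.bidCount / Math.max(1, input.hoursListed);
  const velocitySignal = 1 / (1 + bidsPerHour);

  // Time remaining: auctions closing within 72 hours score higher.
  const urgencySignal = clamp01(1 - input.hoursRemaining / 72);

  // Starting-bid ratio: price still close to the starting bid scores higher.
  const startRatioSignal =
    input.currentPrice > 0 ? clamp01(input.startingBid / input.currentPrice) : 0;

  // Assumed weights that sum to 1; price dominates on purpose.
  const score =
    0.5 * priceSignal + 0.2 * velocitySignal + 0.15 * urgencySignal + 0.15 * startRatioSignal;
  return Math.round(score * 100);
}
```

Each signal is clamped to the 0..1 range before weighting, so one extreme value (say, a listing with hundreds of bids) cannot push the total outside 0 to 100.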
Stack: Next.js, Postgres, TypeScript scrapers per source, daily refresh.
Happy to answer questions about scraping the federal sites (some of them really do not want to be scraped) or how the deal scoring works.
I wish I had jumped on those offers when they ran in the back of Boys' Life, Popular Mechanics, and SOF magazines back in the day.