RE: https://en.osm.town/@osm_tech/116052113368747355
Had a very smart and serious tech policy person ask me last week if scraping was really as disruptive as some of the lawsuits about it say.
@luis_in_brief We're faced with 1) investing yet more time to come up with fingerprint protections to stop the abusive site scraping (which is sidestepping all the previous, traditional controls, like robots.txt and individual IP address blocks) or 2) buying additional hardware resources (at overinflated prices due to the RAM/flash pricing balloon).
@luis_in_brief Historically we've been able to run OpenStreetMap very lean: a small volunteer sysadmin/SRE team and minimal, low-cost refurbished hardware. Both are now being pushed to breaking point. And most annoyingly, the scrapers are trying to get data we ALREADY PUBLISH as complete weekly compressed downloads on planet.osm.org.
@osm_tech someone from WMF told me that they have scrapers from the same IP downloading their *monthly* compressed file every *minute*. The scrapers, besides being malicious, are also often very dumb.
I am not sure how you draw a policy line here that protects the LWNs/OSMs/WMFs of the world without making it impossible to do socially-beneficial extraction of data from our data monopolist overlords. But someone needs to be thinking about that, very hard.
https://social.kernel.org/objects/97b630d4-c31a-4596-9821-6beca6363857
RE: https://indieweb.social/@tchambers/116053050990802350
as if on cue: