social.kernel.org

Luis Villa

Edited 4 months ago

RE: https://en.osm.town/@osm_tech/116052113368747355

Had a very smart and serious tech policy person ask me last week if scraping was really as disruptive as some of the lawsuits about it say.

OpenStreetMap Ops Team

osm_tech@en.osm.town

4 months ago

Reply to @luis_in_brief@social.coop

@luis_in_brief We're faced with either 1) investing yet more time in coming up with fingerprinting protections to stop the abusive site scraping (which sidesteps all the previous, traditional controls, such as robots.txt and individual IP address blocks), or 2) buying additional hardware resources (at overinflated prices due to the RAM/flash pricing balloon).

OpenStreetMap Ops Team

osm_tech@en.osm.town

4 months ago

Reply to @osm_tech@en.osm.town

@luis_in_brief Historically we've been able to run OpenStreetMap very lean: a small volunteer sysadmin/SRE team and minimal, low-cost refurbished hardware. Both are now being pushed to breaking point. And most annoyingly, the scrapes are trying to get data we ALREADY PUBLISH as complete weekly compressed downloads on planet.osm.org

Luis Villa

luis_in_brief@social.coop

4 months ago

Reply to @osm_tech@en.osm.town

@osm_tech someone from WMF told me that they have scrapers from the same IP downloading their *monthly* compressed file every *minute*. The scrapers, besides being malicious, are also often very dumb.

Jonathan Corbet

corbet

4 months ago

Reply to @luis_in_brief@social.coop

@luis_in_brief @osm_tech We (LWN) have seen attacks from over a million IP addresses over the course of a few hours, repeatedly downloading stuff we published 20 years ago. Who knows, maybe somebody will go back in time and change Darl McBride's mind, so you gotta keep checking...

Luis Villa

luis_in_brief@social.coop

4 months ago

Reply to @luis_in_brief@social.coop

I am not sure how you draw a policy line here that protects the LWNs/OSMs/WMFs of the world without making it impossible to do socially-beneficial extraction of data from our data monopolist overlords. But someone needs to be thinking about that, very hard.
https://social.kernel.org/objects/97b630d4-c31a-4596-9821-6beca6363857

Luis Villa

luis_in_brief@social.coop

4 months ago

Reply to @luis_in_brief@social.coop

RE: https://indieweb.social/@tchambers/116053050990802350

as if on cue:
