Posts: 328 · Following: 28 · Followers: 1628

Jonathan Corbet

Forbes is warning us that Android phones are at severe risk due to a kernel vulnerability:

https://www.forbes.com/sites/zakdoffman/2025/02/03/google-warns-all-android-users-your-phone-is-now-at-risk/

This comes from Google's Android security bulletin for February:

https://source.android.com/docs/security/bulletin/2025-02-01

...which informs us that "There are indications that CVE-2024-53104 may be under limited, targeted exploitation". The vulnerability in question, though, is this one:

https://lwn.net/ml/all/2024120232-CVE-2024-53104-d781@gregkh

...which is in the uvcvideo camera driver. Either I'm missing something badly, or the only way to exploit this would be to plug a malicious camera device into the phone. I can see why they would want to fix this, but I'm not sure it's a red-alert situation for most of us?
3
13
19

Jonathan Corbet

Goblin Valley is also worth a visit!
3
1
18

Jonathan Corbet

A week ago we managed to get away for a few days to Capitol Reef National Park — definitely worth exploring. It's important to escape to a beautiful place with no network service every now and then.
0
1
23
@keira_reckons Honestly I don't think it's just a matter of "people who make poor decisions". We have created a world where thousands of predatory people have free rein to try to rip us off every day, and some of them are good at it. I'm pretty aware of such things (I think), but it still feels like it's only a matter of time until I have a bad day and get scammed somehow.
0
0
3
@mcdanlj @LWN What a lot of people are suggesting (nepenthes and such) will work great against a single abusive robot. None of it will help much when tens of thousands of hosts are grabbing a few URLs each. Most of them will never step into the honeypot, and the ones that do will not be seen again regardless.
1
0
2
@penguin42 They don't tell me what they are doing with the data... the distributed scraping is an easily observable fact, though. Perhaps they are firehosing the data back to the mothership for training?
1
0
0
@smxi @monsieuricon Suggestions for these countermeasures - and how to apply them without hosing legitimate users - would be much appreciated. I'm glad they are obvious to you; please do share!
1
0
8

Jonathan Corbet

So I guess I'm famous now :)

https://www.heise.de/en/news/AI-bots-paralyze-Linux-news-site-and-others-10252162.html

To be clear, LWN has never "crashed" as a result of this onslaught. We'll not talk about what happened after I pushed up some code trying to address it...

Most seriously, though: I'm surprised that this situation is surprising to anybody at this point. This is a net-wide problem; it surely is not limited to free-software-oriented sites. But if the problem is starting to get wider attention, that is fine with me...
3
32
54

Jonathan Corbet

A followup for folks who are curious about the whole AI botswarm problem...

Some of these bots are clearly running on a bunch of machines on the same net. I have been able to reduce the traffic significantly by treating each class-C (/24) range as a unit and doing subnet-level throttling. That, and simply blocking a couple of them.
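
For the curious, here is a minimal sketch of what that kind of per-/24 throttling can look like. It is a simplified illustration rather than the code actually running on LWN, and the window length and request budget are placeholder numbers.

```python
# Hypothetical sketch of per-subnet throttling; the 60-second window
# and 300-request budget are made-up numbers for illustration only.
import ipaddress
import time
from collections import defaultdict, deque

WINDOW = 60        # seconds of history to keep (placeholder)
MAX_HITS = 300     # requests allowed per /24 within the window (placeholder)

recent = defaultdict(deque)    # /24 network -> timestamps of recent hits

def allow(client_ip, now=None):
    """Return True if a request from client_ip should be served."""
    now = time.time() if now is None else now
    # Map the address onto its enclosing /24 ("class C") range.
    subnet = ipaddress.ip_network(f"{client_ip}/24", strict=False)
    hits = recent[subnet]
    while hits and now - hits[0] > WINDOW:   # expire entries older than the window
        hits.popleft()
    if len(hits) >= MAX_HITS:
        return False                         # throttle the whole /24
    hits.append(now)
    return True
```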

But that leaves a lot of traffic with an interesting characteristic: there are millions of obvious bot hits (following a pattern through the site, for example), each coming from a different IP. An access log with 9M lines has over 1M IP addresses, and few of them appear more than about three times.
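
The counting itself is straightforward; a rough sketch, assuming a conventional access log whose first whitespace-separated field is the client address (the filename is hypothetical):

```python
# Count requests per client IP and see how the traffic spreads out.
from collections import Counter

counts = Counter()
with open("access.log") as log:            # hypothetical log file
    for line in log:
        fields = line.split(None, 1)       # first field is the client IP
        if fields:
            counts[fields[0]] += 1

total = sum(counts.values())
rare = sum(1 for c in counts.values() if c <= 3)
print(f"{total} requests from {len(counts)} distinct addresses")
print(f"{rare} addresses made three or fewer requests")
```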

So these things are running on widely distributed botnets, likely on compromised computers, and they are doing their best to evade any sort of recognition or throttling. I don't think that any sort of throttling or database of known-bot IPs is going to help here...not quite sure what to do about it.

What a world we have made for ourselves...
11
44
51
@daniel @LWN The problem with restricting reading to logged-in people is that it will surely interfere with our long-term goal to have the entire world reading LWN. We really don't want to put roadblocks in front of the people we are trying to reach.
0
0
3
@DamonHD @kevin So how does Enphase cut off access to a local resource like that? Have they said why such a thing would happen?
1
0
1
@AndresFreundTec @LWN Yes, a lot of really silly traffic. About 1/3 of it results in redirects from bots hitting port 80; you don't see them coming back with TLS, they just keep pounding their heads against the same wall.

It is weird; somebody has clearly put some thought into creating a distributed source of traffic that avoids tripping the per-IP circuit breakers. But the rest of it is brainless.
3
0
3
@RonnyAdsetts @LWN The user agent field is pure fiction for most of this traffic.
0
0
2
@adelie @LWN Blocking a subnet is not hard; the harder part is figuring out *which* subnets without just blocking huge parts of the net as a whole.
2
0
1
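
To make the "which subnets" problem from the reply above concrete, here is a rough sketch that aggregates an access log by /24 and lists the ranges responsible for an outsized share of the traffic. The log format and the 1% cutoff are assumptions; this is not anything LWN actually runs.

```python
# Rank /24 subnets by request volume so only the heaviest ranges become
# candidates for blocking or throttling.
import ipaddress
from collections import Counter

per_subnet = Counter()
with open("access.log") as log:            # hypothetical log file
    for line in log:
        fields = line.split(None, 1)
        if not fields:
            continue
        try:
            net = ipaddress.ip_network(f"{fields[0]}/24", strict=False)
        except ValueError:
            continue                        # first field wasn't an address
        per_subnet[net] += 1

total = sum(per_subnet.values())
for net, count in per_subnet.most_common(20):
    if count > total * 0.01:               # arbitrary 1% cutoff
        print(f"{net}  {count} requests ({100 * count / total:.1f}%)")
```
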
@gme I assume you're referring to https://blog.cloudflare.com/declaring-your-aindependence-block-ai-bots-scrapers-and-crawlers-with-a-single-click/ ?

It would appear to force readers to enable JavaScript, which we don't want to do. Plus it requires running all of our readers through cloudflare, of course...and I suspect that the "free tier" is designed to exclude sites like ours. So probably not a solution for us, but it could well work for others.
1
0
2
@bignose @LWN We have gone far out of our way to never require JavaScript to read LWN; we're not going back on that now.
0
2
11
@johnefrancis @LWN Something like nepenthes (https://zadzmo.org/code/nepenthes/) has crossed my mind; it has its own risks, though. We had a suggestion internally to detect bots and only feed them text suggesting that the solution to every world problem is to buy a subscription to LWN. Tempting.
5
3
36
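
For readers unfamiliar with nepenthes: it is a tarpit that feeds crawlers an endless maze of slowly generated pages. A toy sketch of the idea follows; Flask and the route name are assumptions, and nothing like this actually runs on LWN.

```python
# Toy tarpit in the spirit of nepenthes: serve pages of random internal
# links, slowly, so a crawler that follows them wanders ever deeper into
# generated garbage instead of hitting real content.
import random
import string
import time

from flask import Flask

app = Flask(__name__)

def word():
    return "".join(random.choices(string.ascii_lowercase, k=8))

@app.route("/maze/")
@app.route("/maze/<path:rest>")
def maze(rest=""):
    time.sleep(2)        # make each page slow, to waste the crawler's time
    links = "\n".join(
        f'<a href="/maze/{word()}/">{word()}</a>' for _ in range(10)
    )
    return f"<html><body><p>{word()} {word()}</p>\n{links}\n</body></html>"

if __name__ == "__main__":
    app.run()
```
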
@beasts @LWN We are indeed seeing that sort of pattern; each IP stays below the thresholds for our existing circuit breakers, but the aggregate load is overwhelming. Any kind of active defense is going to have to figure out how to block subnets rather than individual addresses, and even that may not do the trick.
3
1
3

Jonathan Corbet

Should you be wondering why @LWN #LWN is occasionally sluggish... since the new year, the DDoS onslaughts from AI-scraper bots have picked up considerably. Only a small fraction of our traffic is serving actual human readers at this point. At times, some bot decides to hit us from hundreds of IP addresses at once, clogging the works. They don't identify themselves as bots, and robots.txt is the only thing they *don't* read off the site.

This is beyond unsustainable. We are going to have to put time into deploying some sort of active defenses just to keep the site online. I think I'd even rather be writing about accounting systems than dealing with this crap. And it's not just us, of course; this behavior is going to wreck the net even more than it's already wrecked.

Happy new year :)
45
447
355