Conversation

I'm having trouble figuring out what kind of botnet has been hammering our web servers over the past week. Requests come in from tens of thousands of addresses, only once or twice each (so fail2ban never blocks them), with varying browser strings (Chrome versions ranging from 24.0.1292.0 to 108.0.5163.147) and ridiculous cobbled-together paths like /about-us/1-2-3-to-the-zoo/the-tiny-seed/10-little-rubber-ducks/1-2-3-to-the-zoo/the-tiny-seed/the-nonsense-show/slowly-slowly-slowly-said-the-sloth/the-boastful-fisherman/the-boastful-fisherman/brown-bear-brown-bear-what-do-you-see/the-boastful-fisherman/brown-bear-brown-bear-what-do-you-see/brown-bear-brown-bear-what-do-you-see/pancakes-pancakes/pancakes-pancakes/the-tiny-seed/pancakes-pancakes/pancakes-pancakes/slowly-slowly-slowly-said-the-sloth/the-tiny-seed

(I just put together a bunch of Eric Carle titles as an example. The actual paths are pasted together from valid paths on our server but in invalid order, with as many as 32 subdirectories.)
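For anyone who wants to pick these requests out of their own logs, here is a rough sketch of a check based on the two traits described above: unusually deep paths and the same segment showing up several times. The function name `looks_cobbled` and both thresholds are my own assumptions, not anything fail2ban or a web server provides, so tune them against your site's real URL structure.

```python
from collections import Counter


def looks_cobbled(path: str, max_depth: int = 12, max_repeats: int = 2) -> bool:
    """Heuristic: flag paths that are very deep or repeat a segment too often.

    max_depth and max_repeats are assumed thresholds, not known-good values.
    """
    # Drop any query string, then split into non-empty path segments.
    segments = [s for s in path.split("?", 1)[0].split("/") if s]
    if len(segments) > max_depth:
        return True
    # A legitimate URL rarely contains the same segment more than twice.
    counts = Counter(segments)
    return any(n > max_repeats for n in counts.values())


if __name__ == "__main__":
    print(looks_cobbled("/about-us/the-tiny-seed"))
    print(looks_cobbled("/a/b/a/b/a"))
```

This only inspects the path, so it could feed a log filter or a rewrite rule that returns an error early, but it says nothing about who is behind the requests.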

Has anyone else been seeing this and do you have an idea what's behind it?

@linuxandyarn Welcome to the world of AI scraper bots ... https://lwn.net/Articles/1008897/

Looking at the web page of a company called "Bright Data" is informative too.

@jwildeboer I wondered about that, but since they don't identify themselves the way "friendlier" bots like ClaudeBot or PetalBot do, they've been much harder to manage. I also thought a malicious browser plugin could be involved.


@jwildeboer If they're mobile apps, then presumably most of them will be behind CGNAT, so even one device on an ASN will likely seem to have multiple IPv4 addresses (e.g., the 4 per ASN you've seen).

That might also be why they tend not to use v6: a device would presumably keep a stable v6 address for at least a few hours.

@linuxandyarn
