Conversation
Edited 2 days ago
Some botfarm is aggressively crawling lore.kernel.org pretending to be b4 in the user-agent, except that real b4 has a very distinct usage pattern, so it's easy for me to recognize and ban them.

@monsieuricon I find it extra irritating that it's not just some generic bot pretending to be a browser. They're actually aiming at lore.kernel.org specifically judging by the user agent :/

@forst Yes, and the most annoying part is that if they want to train their stupid AI whatever, they can just clone the underlying repositories instead of hitting us for every URL.

@monsieuricon They must've asked an AI how to crawl lore.kernel.org without getting banned

@monsieuricon Thanks for dealing with that. I love b4 and lore!

@monsieuricon *yelling at the bot operators through cupped hands* CLONE THE REPOS, NERDS

@monsieuricon I think all public email archives should remove email addresses, and possibly names, so that the information can be indexed but not linked to its authors unless you participate in the mailing list. The same goes for bug trackers like Bugzilla.

@taketwo I see where you're coming from, but that invalidates cryptographic signatures and breaks end-to-end attestation.