"we caught one" god. Luxurious.

Once again, I strongly advise you to set up fail2ban so that anyone you serve a 404 catches at least a full day ban, and if you don't care about talking to other people's services, do your best to fully block the IP ranges associated with all the major hosting companies.

https://exple.tive.org/blarg/2025/10/21/raised-shields/

https://infosec.exchange/@foobardevs/116246141464905287

@mhoye a single 404 is very aggressive, all it takes is one fat-fingered link in a page pointing to your site and you're banning visitors en masse.

after 5-10 404s in a row for different URLs, different story.
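
A rough sketch of the "several 404s in a row for different URLs" rule, in Python. The class name, the limit of 5, and the idea of resetting the streak on any non-404 response are assumptions made for illustration, not something anyone in the thread specified.

```python
from collections import defaultdict

STREAK_LIMIT = 5  # assumed threshold: the low end of the "5-10" range


class StreakTracker:
    """Tracks consecutive 404s for distinct paths, per client address."""

    def __init__(self, limit=STREAK_LIMIT):
        self.limit = limit
        # ip -> set of distinct paths that 404'd in the current streak
        self.paths = defaultdict(set)

    def observe(self, ip, path, status):
        """Record one response; return True if the client should be banned."""
        if status != 404:
            # Any non-404 response ends the streak for this address.
            self.paths.pop(ip, None)
            return False
        # Repeated hits on the same broken URL only count once.
        self.paths[ip].add(path)
        return len(self.paths[ip]) >= self.limit
```

Because paths are kept in a set, someone reloading one truncated link never reaches the limit; only a run of different missing URLs does.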

@azonenberg Nobody is fat-fingering their way to .env or backdoor .asp files.

@mhoye yes i'm all for having specific poison URLs that trigger an immediate ban.

But "immediate ban on any 404 whatsoever" seems heavy handed.
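
One way the "poison URL" distinction could look in Python, as a minimal sketch. The `.env` and `.asp` patterns come from the thread; the `.php`, `.aspx`, and `/wp-` patterns and the function names are illustrative assumptions, and the list only makes sense for a site that serves none of those paths itself.

```python
import re

# Illustrative patterns only; adjust to what your site really serves.
POISON_PATTERNS = [
    re.compile(r"(^|/)\.env$"),                       # dotenv files
    re.compile(r"\.(asp|aspx|php)$", re.IGNORECASE),  # foreign server-side scripts
    re.compile(r"^/wp-"),                             # WordPress probes (assumed)
]


def is_poison(path):
    """True if the request path matches one of the poison patterns."""
    path = path.split("?", 1)[0]  # ignore any query string
    return any(p.search(path) for p in POISON_PATTERNS)


def should_ban(path, status):
    """Ban immediately only when a poison path actually 404s."""
    return status == 404 and is_poison(path)
```

An ordinary 404 for a mistyped article URL falls through `should_ban` untouched and can be handled by a gentler threshold rule.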

@azonenberg Read your logs, tell me what you see. Nobody's typing out URLs anymore.

@mhoye @azonenberg .... we do personally do that

we're prepared to accept that we do not exist, in a statistical sense. that is true in SO many ways

but we hesitate to put into place a rule which would lock ourselves out. we'd at least apply a threshold of a few 404s over a window of time, not just a single one
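
A minimal Python sketch of "a few 404s over a window of time". The class name, the default of 5 hits, and the 10-minute window are assumptions picked for illustration; the thread does not give numbers.

```python
import time
from collections import defaultdict, deque


class WindowedCounter:
    """Counts 404s per address inside a sliding time window."""

    def __init__(self, max_404s=5, window_seconds=600):
        self.max_404s = max_404s
        self.window = window_seconds
        self.hits = defaultdict(deque)  # ip -> timestamps of recent 404s

    def record_404(self, ip, now=None):
        """Record a 404; return True once the address reaches the threshold."""
        now = time.monotonic() if now is None else now
        q = self.hits[ip]
        q.append(now)
        # Drop timestamps that have aged out of the window.
        while q and now - q[0] > self.window:
            q.popleft()
        return len(q) >= self.max_404s
```

A single stray 404, or a couple spread over a day, never trips this; only a burst inside the window does.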

@ireneista @mhoye exactly, i've copy pasted a link and truncated it or something so many times while making another blog page or sending an email etc.

URLs get truncated in emails by 80 column format limits all the time too.

If you ban on a single 404, or repeated hits to a single truncated/corrupted URL, you'll cut off a lot of legitimate users. Specific bogus URLs, e.g. .asp and .php URLs when your site is written in python, make a lot more sense as poison ban URLs.

As someone who is currently unable to order delivery from my local grocery store due to excessively paranoid WAFs, I'm strongly against this kind of hair trigger defense mechanism.

I've stopped buying components from Mouser because they lock me out constantly for reasons unknown; doing a single search for a component part number is often enough to trigger it.

@azonenberg @mhoye oh, yeah, we intentionally browse the web in ways that deny sites most of the secondary signals they use to assess human-ness, because we think it's none of anyone's business and our privacy background makes us highly aware of all the other things that data can be used for....... so we get locked out of pretty much everything, constantly, and we have a lot of habits around getting un-locked-out as part of our browsing experience. sigh.

@azonenberg @mhoye we have noticed that mouser is a particular offender, indeed, right up there with the less-regulated financial institutions (the various money transfer services that do the things banks do but avoid being treated as banks by the law) in regard to its hair-trigger nature

mouser does seem to take a much narrower list of signals into account than the money transfer services do, which makes it easier to defeat. so that's nice?

@azonenberg @mhoye but yeah, we buy from digikey, it's easier and we like their filter UI better anyhow

@ireneista @mhoye exactly, I've just stopped giving mouser my business over this. You go out of your way to not get my business, I'll honor your wishes

@mhoye Like others have said, please don't. This screws over real ppl (note: blocking "hosting" IP ranges blocks folks with DIY VPNs! This will matter even more if countries start banning commercial VPN services!) and isn't needed to mitigate scraper or vuln scanner hammering. Just heavily throttle addresses as soon as they perform suspicious accesses.
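
A sketch of throttling instead of banning, in Python. The doubling delay, the 1-second base, the 60-second cap, and all names here are assumptions for illustration; the post only says "heavily throttle".

```python
class Throttle:
    """Escalating per-address delay: doubles with each suspicious access."""

    def __init__(self, base_delay=1.0, max_delay=60.0):
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.strikes = {}  # ip -> number of suspicious accesses seen

    def flag(self, ip):
        """Call when an address does something suspicious."""
        self.strikes[ip] = self.strikes.get(ip, 0) + 1

    def delay_for(self, ip):
        """Seconds to hold this address's next response (0.0 if never flagged)."""
        n = self.strikes.get(ip, 0)
        if n == 0:
            return 0.0
        return min(self.base_delay * 2 ** (n - 1), self.max_delay)
```

A person behind a flagged address still gets through, just slowly; a scanner making thousands of requests pays the capped delay on every one. A real deployment would also want strikes to expire, which this sketch leaves out.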

@mhoye so if i follow a link to your site, and the URL is no longer valid, I get banned for a day?

That's certainly an effective way to make nobody ever visit your site again.

@mhoye @azonenberg another thing we've seen in logs recently is automated requests, e.g. for favicons that never existed, or for stylesheets that have moved, which happen as part of every successful pageload and which generate 404s. so there needs to be some way of dealing with that, too
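
One possible way of dealing with that, sketched in Python: exempt the 404s that browsers and readers generate on their own before counting anything toward a ban. Favicons and stylesheets are from the post above; `apple-touch-icon.png`, the extension list, and the function name are assumptions.

```python
import posixpath

# Files that clients commonly request unprompted (partly assumed list).
BENIGN_BASENAMES = {"favicon.ico", "apple-touch-icon.png"}
# Extensions of page assets that may have moved (assumed list).
BENIGN_EXTENSIONS = {".css", ".ico"}


def counts_toward_ban(path):
    """False for 404s that look like automatic asset requests."""
    path = path.split("?", 1)[0]  # ignore any query string
    name = posixpath.basename(path)
    ext = posixpath.splitext(name)[1].lower()
    return not (name in BENIGN_BASENAMES or ext in BENIGN_EXTENSIONS)
```

Only paths where `counts_toward_ban` returns True would be fed into whatever threshold or streak rule is in place.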

@ireneista @mhoye @azonenberg I'm seeing bots guessing URLs that look like something I might have written, too. There is no `/articles/removing-trackers/floc-affinity` on my site. Maybe generating broken links for a user who asked for a summary?

And there are always a lot of 404s for stuff like `/page-title/favicon.ico` after my site gets on "Hacker News" -- so many crawler scripts (at varying levels of working) get links from there

@dmarti @ireneista @azonenberg I've created favicons despite never caring about that at all, because so many clients that appear to be legitimate humans doing legitimate human things reach for them automatically, including a bunch of feed readers and aggregators.

@mhoye @azonenberg That part is true. I would never ban an address just for a 404 (though we do track such things in other ways). But if somebody is going for, say, /wp-anything, that's not a typo, that is seeing if a random doorknob is unlocked. Not the sort of reader we are writing for.

@corbet @mhoye yeah exactly what i'm getting at... there's a huge difference between "truncated url to an article" or "mistyped image URL" and looking for /wp-admin on my static HTML site.

That said, it's a static site; you can look for wordpress endpoints all day, nothing will talk back to you :p

@mhoye @ireneista @azonenberg Yes, I also made them. Found some good sources of freely licensed images and a handy favicon maker site. https://blog.zgp.org/favicon/
