Conversation

Jonathan Corbet

Today I got a cheery email from somebody who claims to be the "ethics and compliance" officer for a company called Bright Data. He wanted to have a "no pressure" conversation about the whole AI scraperbot problem. Looking at their web site, this company offers an API that, and I quote, "Bypasses anti-scraping mechanisms and solves CAPTCHAs, ensuring uninterrupted access to the most protected web sites".

After careful consideration for several milliseconds, I have concluded that I really don't have anything to discuss with this person.

But at least their claimed "100M+" of residential IP addresses that they use for their DDOS attacks are "ethically sourced".

@corbet Book a meeting. Burn an hour of his time. Burn as much as you can stand. Only then explain to him the various dimensions in which he is wanting in life.

@corbet Imagine getting such a mail about a business helping us to bypass Disney's copyright walls or similar.

That'd be a very short-lived company.

GenAI scrapers as a whole need to be regulated and sued out of business 😐

@corbet Ah, a protection racket. "We sell the DDOS service to AI companies, but for a tiny sum we'll exclude you."

@corbet Let them call you over the phone, so they get billed, but connect them to an AI chatbot to keep them busy (and billed) for as long as possible.

@corbet you should ask him for a list of IPs “before you buy”!

and then accidentally share it publicly :^)

@corbet I'm surprised that you passed on that. In your shoes I'd have had some choice words for them. And it might have made for a good article... shining more light on their practices and putting some names out there might just have a tiny impact on their behavior. Maybe.

@corbet What if he wanted to leak something, or explain how guilty he feels for working for such a company and ask how to proceed with life?

@jzb @corbet As far as I can tell, in their business "ethical sourcing" means that when they pay an app developer to include their scraping SDK, they require that the app's privacy policy include a disclosure that it does scraping along with whatever else it does https://brightdata.com/trustcenter/sourcing

@jzb There is the old maxim about mud wrestling with pigs: you just get muddy and the pig enjoys it. These people have taken enough of my time as it is, and I doubt I have anything to tell them they haven't heard before.

@corbet It's Chinese bullshit, doesn't work tbh

@corbet "ensuring uninterrupted access to the most protected web sites": if they had any ethics, perhaps they would offer to pay hosting costs for the affected web sites?

@corbet Maybe LWN should introduce an AI Scraper Bot subscription level you can point ethics people to. 🤔

@corbet @monsieuricon "Ethically sourced" aka "they pay for a 'VPN', and the 'contract' states that we use their IPs for the endpoints". This is what people do to circumvent geoIP blocks: they pay for a service that does not offer something in their geo, so they have to "VPN" around it, and in doing so they help "companies" like this. Sigh.

@fenruspdx @corbet lovely website you have there, it'd be a shame if something happened to it...

@fenruspdx He actually had the gall to write back to me and, after some sanctimonious bullshit about keeping publicly available data available, offered: "If you can have both visibility and control about any bot coming to your domain, and the option to set sensitive end points, wouldn't that be something worth exploring?"

So yes, you were right. They are selling protection schemes as a side gig.

@corbet @jzb We're not going to stop Bright Data from doing this, but we should be making some noise to some of their highlighted customers. @mozillaofficial is listed right there. As is the United Nations and the University of Oxford. Does supporting a company that's DDOS'ing open source projects fit with the mission of those organisations?

@corbet One possibility is that this company places ads on various web sites, with the ads using JavaScript to make HTTP requests to the web sites they are targeting. The people whose computers are using those IP addresses would then have no idea that this was happening. The scheme may depend on browsers that lack the latest security features.

@corbet

Bet the email contains a Tracking Pixel.

@corbet I would. And then let the bashing begin!

@bzdev @corbet Bright Data runs a "free" VPN service where you offer your own desktop as an exit node for other users, and in return you can use their network as your exit node. That's the trade, and many people willingly make that deal. Their other business is selling access to the VPN network for AI scraper bots. So it works like a botnet or Tor, but with the willing participation of the nodes, which makes it more ethical than a botnet; they aren't wrong.

@corbet Love the millisecond bit about consideration. Perhaps it should've been in the usec range though. We need to optimize those cycles, it was wasted time. But by all means, F these people.

@corbet good call - Bright Data is one of the largest residential proxy services which gets (at least part of) its operational service base by literally having users everywhere provide it for free: Bright VPN

@corbet Lol, the .com is blocked in my pihole 😆
