I'm requesting that my questions and answers be permanently deleted under GDPR.
It's just a reminder that anything you post on any of these platforms can and will be used for profit. It's just a matter of time until all your messages on Discord, Twitter etc. are scraped, fed into a model and sold back to you.
@ben Stack Overflow has already been monetizing your answers with ads for years. If “used for profit” is your main complaint, you’re a little late.
@mighty_orbot @ben The argument isn't about profit, which is pretty clearly outlined. OpenAI's explicit and ultimate intent is to replace people and in the meantime it's spitting out garbage information.
@andrewfelix @mighty_orbot @ben
And their software is laundering the original source of the information from which their AI training data was derived. Doesn't the original author deserve some credit for when ChatGPT regurgitates a lossy paraphrasing of a post scraped from the Internet?
@ben They’re not yours, they’re theirs. Jeff Atwood thanks you for your free labour. (I’m kidding, he doesn’t. Feel grateful he even allowed you to contribute in the first place, serf.)
Speaking of Jeff Atwood, isn’t he the guy helping fund Mastodon now? 🤔
#SiliconValley #PeopleFarming #JeffAtwood #surveillance #capitalism #AllYourDataAreBelongToUs
@Orb2069 @felipe @ben This is something I've thought about ever since I started here. It's great that people here take their time to make the web better for disabled people.
But unfortunately, high-quality image descriptions are a gift to AI companies training text-to-image models. There is no act of altruism these assholes will not exploit.
@datarama @Orb2069 @felipe @ben
In a perfect world, AI could be used to describe images to vision impaired people.
The real wrong isn't the AI itself, but that its owners use it only for selfish gains.
Kind of like GMOs, we could use them to feed more people for less but Monsanto only uses them to gouge farmers.
@bornach No you should not. This unfortunately is really a situation where doing the right thing is self-sabotage.
You can poison the image using Nightshade, but I would not count on its effectiveness.
@Phosphenes @datarama @felipe @ben
In a perfect world, AI wouldn't "hallucinate" (PR spin/flavor on just being wrong), and might be useful for that sort of thing.
(Btw: Meta already does this, but their alt tags consist of something like "Image may contain <object>, <text>, <object>" - the data exists because they have to run image analysis for automated moderation anyway - they surface it because it satisfies ADA requirements.)
@Phosphenes @datarama @felipe @ben
Eh? I mean, it's not super useful. For example: "image contains woman, cat, salad, <badly OCR'd text>"
https://amp.ebaumsworld.com/pictures/woman-yelling-at-cat-memes/86009016/
@Orb2069 @Phosphenes @datarama @felipe @ben If the OCR worked better, I’d be able to (probably) tell exactly what that is. Woman/cat/salad with two lines of text is absolutely the Woman Yelling At Cat meme format. It would require some prior knowledge though.
On the flip side, you’d think they would run images through a reverse image search and tag hits on meme templates. I get hits for it which have the title of the meme in text.
@ben I mean
user contributions licensed under CC BY-SA
I'm not a lawyer, but I don't think you can do anything about it, they're technically hosting a copy of your content with attribution to you, which doesn't make you an owner of the data, in particular this clause:
Adapt — remix, transform, and build upon the material for any purpose, even commercially.
gives them right to fuck their userbase in the ass by using the data in other services
@13xforever They're then selling that data to OpenAI, which does not abide by this license. I'm not getting attribution there, and they're not licensing it as CC BY-SA, which is required.
@ben@m.benui.ca Chaotic evil: send in an anti-circumvention DMCA notice for each question. Those have no process for disputing, so they will probably just delete your content and ban you, because it is easier.
@ben@m.benui.ca The enshittification will continue until the morale improves.
Thank you for the replies. As someone pointed out, anything posted on Stack Overflow is covered by CC BY-SA 4.0.
Under this license all usage must attribute the author and must have a similar license. Neither of which OpenAI fulfills.
@ben "AI is a lying machine made out of crimes."
https://www.tiktok.com/@alex_falcone/video/7366006020352642347
@ben i haven't read their tos but are you sure that it doesn't include licensing whatever you say to stackoverflow? the last paragraph of the page you shared seems to allude to that
i mean, it's still immoral as heck but i guess that's one of the reasons we're all here instead of on a centralized content farm
@ben if only there were a word for taking things you don't own. 🤔
Gosh it would make talking about gen AI easier if we had a word for that. 🤔
@AeonCypher @datarama @felipe @ben
Please, mr. Reply guy, tell me about the inevitability of AI.
When you're done, explain to me how you reliably achieve +95% accuracy on k-fold validation without undetectable overfitting - my prof never could provide a simple answer, and 1-out-of-20 seems like really not good odds for a new god.
Also, CC claims that training an AI on data is "fair use". So fuck Creative Commons I guess.
https://creativecommons.org/2023/02/17/fair-use-training-generative-ai/
@ben #Funfact: all "#learning" is #FairUse, otherwise you'd be a perpetual #DebtPeon to [whoever made your schoolbooks and created whatever media you ever consumed](http://felixreda.eu/2021/07/github-copilot-is-not-infringing-your-copyright/)!
@ben I'm longing for a new set of free (as in beer) software and creative licenses that prevent all this garbage.
I put my software out there so other people can use it, I'm even ok if they make money out of it. But I'm not ok with my work being swallowed by a big machine so that people can print money without even knowing it exists at all.
@ljs @datarama @Orb2069 @ben @felipe
Are you saying to trust me? I'm not a 'him'.
I'm quite strongly against OpenAI. What you are saying is quite the opposite of what I said.
The comment above continues to be an irrelevancy. A strung together set of jargonizations.
No one builds LLMs with k-fold validation. OpenAI's models are, likely intentionally, overfit. Which is why they are full of exact copies of data.
However, again, whatever you two think you're arguing against it's not related to a position I hold.
@tdr @kkarhan @ben Perhaps I can clarify, as I wrote the article. § 44b UrhG is the German transposition of Art. 4 DSM copyright directive, which I cover in the article: “Since the EU Copyright Directive of 2019, … where commercial uses are concerned, rightsholders who do not want their copyright-protected works to be scraped for data mining must opt-out in machine-readable form”, so although Germany had not adopted §44b yet, the article takes it into account.
Alas it is always the Luddite question is it not?
Ask not what the machine does but to whom and for whose benefit?
AI should be creating a better future for the benefit of all, and mostly for those in dire need. Instead it reaps the benefits for the fat cats above, and indulges in the #enshitification of our reality.
And you wrote "in a perfect world" - I don't think this should be considered in such terms. That should be our normal one.
@Orb2069 @ljs @datarama @ben @felipe
Are you accusing me of being a bot? Kindly go fuck yourself.
I actually work with the technology and actively work _against_ the corporate powers trying to monopolize it.
You on the other hand are spewing jargon you do not understand in order to look smart, and fearmongering about something you know nothing about.
@AeonCypher @ljs @datarama @ben @felipe
What a strange non sequitur.
I wonder if you're actually trying to understand something, or if I should simply block you.
@AeonCypher @datarama @Orb2069 @felipe @ben
Not until it solves its energy consumption problem!