All it would take for AI to completely collapse is a ruling in the US saying these companies have to licence the content they used to train these tools.
They simply would never reach a sustainable business model if they had to fairly compensate all the people who wrote, drew, edited, sang or just created the content they use.
Simply being forced to respect attribution and licenses would kill them. Will that ruling ever happen? Maybe not. Should it? I think so.
@thelinuxEXP copyright laws are all so outdated (in the US anyway, according to most YouTubers I've heard cover the topic).
It'd be good to see a complete overhaul now that everyday people can make content seen by millions.
@ligniform I completely agree. If only because for once, it would also protect small creators and artists, not just giant companies!
@thelinuxEXP
They would just move to other language corpuses, no?
@lepapierblanc They would either have to pay the people who make the content, or use completely copyright free / license free material, which would basically render them pretty useless.
Big companies when they see someone using their 57-year-old, 2-second-long sound effect: GO TO JAIL
Big companies stealing every bit of creative content from the internet without permission from the small creators: 
@mahbub « It's different, we're not copying the content, we're creating something derivative, so it's OK », they say, as they refuse to acknowledge licenses
@thelinuxEXP what about non American or non-Western entities though? As much as I don't like the idea of American firms scraping everything to produce products using our work without paying us, I'm even less fond of the idea of China taking over and marching ahead without competition.
@sysop408 These companies are mainly US-based, and I would argue the US is the biggest repository of works they use, so this would put a stop to most efforts.
I would also love to see rulings in other areas of the world, though. I live in the EU, and I would be very happy to see the European Commission making it illegal to use EU produced content to train AIs without licensing rights.
@thelinuxEXP To play devil's advocate a bit here: people also learn in a similar way. You have to read to learn how to write. You have to listen to music to learn how to make your own, etc.
I think there are at least 2 main differences. The first one is that a human can only produce so much work on their own, while AI can mass produce.
@thelinuxEXP I would be very surprised if that ruling ever came.
@thelinuxEXP their CURRENT business model is unsustainable. They are all losing a lot of money
AI has destroyed the symbiotic relationship that existed between content creators and search engines; there's no reward loop anymore. The current state of generative AI is one of parasitism. Without incentives for creating new content, who is going to create new content in the future? That reward loop needs to be restored somehow.
@thelinuxEXP Not sure how/if it could be implemented but legislation requiring AI scrapers to identify themselves would allow servers to block them.
Web content doesn't make itself. Someone made it and owns it. (That remains true with AI-generated content.) Establishing a right to *not* have your content scraped, and implementing the opt in/opt out switches, would be an excellent approach.
(In my view, the right to *not* have content scraped is inherent in copyright.)
@thelinuxEXP I wish laws applied equally to everyone. If we aren't going to do IP, we should get generic drugs NOW. If we are going to do it, AI should pay for the content.
@thelinuxEXP Nick, you're talking about #capitalism as a system not just the AI bit of it lately come to fruition.
If CAPITALISM had to "fairly compensate" everyone who makes it work it would fall apart.
@thelinuxEXP AI is premature; it shouldn't have become mainstream just yet, so such a ruling is a *must*
@remenca That's not at all what is happening, though, is it?
@Abercrombie Sure. But I should get a say if my personal data and health data are used to train this.
And this is one good use case among many pretty bad ones.
@thelinuxEXP The first problem you will have, in a legal sense, is proving that your work was used to train a model. There is pretty much no way to trace individual training samples from a transformer model. So you lose right there… Even if a law existed that licenses had to be respected, it would be unenforceable.
@vartak The NYT proves that pretty competently already, ChatGPT can just spit out entire parts of their articles ;)
@thelinuxEXP This is trickier than you are making it out to be. When an object is used to train a network, it isn't being copied. But information regarding that object is captured in the network 'anonymously' and 'abstractly'. So, as an analogy - you definitely own your beard. But do you also have a right to a picture of your beard that I took in the wild? Or if someone wrote an article describing a beard that looks like yours... Do you also own that article?
@vartak I do own the rights to a picture of my beard that you took, yeah ;) That's the general rule for pictures of people and buildings
@thelinuxEXP I agree, but I don't think it will happen. The LLMs have all already been trained on stolen data. It's a knot that can't be undone at this point. There will be a lot of hand wringing and yelling, but in the end the corporations and *their* government lackeys will just hand-wave any grievances and then "promise" not to do it again in the future knowing full well they absolutely will.
In the end we're all to blame though. We clicked "I agree" on every social media platform.
@apemantus It's not, though. There was never any bubble in the first place. There were people who made content for ridiculously small payouts, and a really tiny fraction making a lot of money.
@thelinuxEXP honestly, I don't think that's necessary. Training an LLM isn't the same as using copyrighted materials. That's like saying that copy-pasting this post into a text file on my computer requires me to pay you for it!
Instead, I'd argue for giving companies incentives to release their LLMs publicly, like Meta and Mistral do.
Unless you are truly looking to kill generative AI, in which case we can't have any discussion. But I can say that throughout history, every new tech has faced people who thought it was their duty to destroy that technology no matter the cost.
@hirad That's not the same at all, though, is it? Because they're not just copying content, they're selling access to a tool that uses that content, which they grabbed without attribution and without respecting licensing either.
It's not the same as personal use by an individual ;)
@hirad I don't want to destroy it, I want these tools to respect what they trained on, which currently they don't.
I'm not even affected yet, AFAIK, but the argument that it's just like copying a file doesn't work, and never did. A company selling a product doesn't play by the same rules as an individual making personal use; that's never been the case :)
@thelinuxEXP @vartak That's definitely not the rule, Nick. If it's in public, it's legal to photograph, and the photo belongs to whoever took it.
Barbra Streisand learned that rule the hard way.
@remenca Ah yeah, that's really the biggest AI model or tool everyone is hearing about right now, and absolutely the direction most commercial AI tools are going…
@thelinuxEXP That sounds a lot like you don't think there should be LLMs at all. At least Western ones.
@bouncing Well, no. I think they are useful, but they need to compensate people for using their work, just like every other industry had to do before them. If they can't make that work as a sustainable business model, then yeah, they can go.
@thelinuxEXP I have used every one of those words in a previous post.
You will be hearing from my lawyers.
It all depends on how the information is reused.
If whole passages are copied, that is copyright infringement. But using collected works to learn what is proper is basically how humans do it, on a much larger time scale.
@i_gvf Except one human doesn't compare to the scale of a giant model that learns 10,000 times faster from millions of sources. You can't apply the law of one human being to a giant data center; it doesn't work.
@thelinuxEXP One of the sometimes positive things about Capitalism is that it is an adversarial system, so these decisions don't happen in a vacuum, and it is interesting to wonder whether and why these new AI companies have more leverage/influence/power than media companies.
@thelinuxEXP I would be surprised if it doesn't fall under the "fair use" doctrine. We wouldn't want to do away with fair use, which lets us quote each other and learn and apply new techniques without asking permission. Requiring licensing and such for AI training would need to show that the output of that training is derivative, and seeing as it's learning in ways very similar to the way we do… that could be problematic. It's a big, complex issue.
@thelinuxEXP An alternative would be if USPTO decided that generated content could not be copyrighted.
No company or VC firm would touch the stuff ever again. It'd live on, but in a very diminished manner.
@thelinuxEXP That would collapse OpenAI, but companies could obtain enough legally licensed and useful data to build new models.
@thelinuxEXP it gives power to everyday people who make art/videos/other content so I have my doubts it'll happen, but it'd be a nice change.
AI will let everyday people make better videos, e.g. instead of an artist painting individual images, they can storyboard short films.
Fleshing out game worlds takes huge amounts of art.
Also, there are plenty of gaps AI can't handle yet; better to focus on those (vs holding back a new capability)
@walter4096 @ligniform I'm not saying that's bad, I'd love AI assistants to make my video production easier and better! I just don't think this is worth stealing content from people :)
It only works well when it's trained on huge volumes.
The more it's trained on, the more general it gets, and the less likely it is to be overfit.
Copyright on specific things still applies, e.g. I can draw an X-wing fighter because my brain learned from seeing it, but I can't sell that. Can't we treat AI the same way?
"Everything is a remix", kind of, anyway. Darth Vader = samurai helmet + respirator, X-wing = dragster + dart, etc.
@walter4096 @ligniform Just because it only works on stolen content doesn't make it OK :)
Is it really 'stealing' if it's just doing what we do: learning from what it sees?
There's already a concept of derivative and transformative works in copyright law to account for this sort of thing.
The training process is deriving generative rules from the data rather than copying it.
Anyway, if you enforce a strict interpretation, rival governments/states would still use it against us (propaganda, weapons, general-purpose robots).
@walter4096 @ligniform This learning argument is fallacious at best. It's not like it's one human learning, and using that for themselves.
It's an automated system doing that at a gigantic scale, built by a company for profit. Not comparable at all ;)
"Built by a company for profit": there are open-source AI models as well, and we could crowdfund training runs.
Stable Diffusion on my PC could generate 10,000 images per day. You can give it sketches to add detail; it's more controllable.
Isn't it better if everyone has this multiplier (images, text, code, motion…)?
@walter4096 @ligniform No, not at the price of the hard work of artists.
@thelinuxEXP I understand that you aren't happy about them using such content, but where do they violate licenses? Aren't they using material publicly available on the internet? Licenses may forbid copying or distributing it, but reading it or learning from it? I don't think any license forbids that.
@duco The GPL says that all code built upon GPL code needs to be GPL. I would argue all Copilot-generated code should thus be GPL.
Some licenses require attribution even for derivative works. No AI does any attribution.
@thelinuxEXP Maybe. You could make a pretty persuasive argument that LLM training is fair use, as it's transformative.
There are also examples of society deciding that itās important not to require an explicit individual license: https://en.wikipedia.org/wiki/Compulsory_license
Also worth pointing out, you can opt-out of LLM training with a simple robots.txt entry.
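For anyone wondering what that entry looks like: a minimal robots.txt sketch, using the crawler user-agent names OpenAI and Google publicly document for training opt-outs (any other names would need checking against each vendor's docs):

```
# robots.txt — ask AI training crawlers not to scrape this site
User-agent: GPTBot            # OpenAI's training crawler
Disallow: /

User-agent: Google-Extended   # Google's AI training opt-out token
Disallow: /
```

Worth noting that honoring robots.txt is voluntary; it only stops crawlers that choose to respect it.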
@bouncing Fair use is a case-by-case thing; there is no blanket definition of it. So every generated result would have to be judged individually on how transformative it is relative to all the works it used :) Basically impossible
We never had this automated remix ability in the past. It is a waste of human hands and minds to do manually things that a machine can do.
This is a new reality: a PC can generate 10,000 new, unique, guided images per day.
In that world it's not worth doing the same kind of 2D art, but art skills won't go away.
Real artists will get far more out of AI tools than me.
I look forward to their AI movies!
@walter4096 @ligniform I'm not discussing the usefulness.
But « it's so useful and practical » is not a good argument for appropriating all that content without thinking about the people who created it, who it belongs to, or its license. It was never an argument.
At that point, I could say it's OK to steal a billionaire's money because I would use it to solve world problems. That argument doesn't work; usefulness doesn't come before everything else.
This theft argument…
Literally everything everyone does is influenced by traces of what they've seen.
Star Wars was patterned on 'The Hidden Fortress', elements of the standard "hero's journey", and lore Lucas wrote because he couldn't get the Flash Gordon license.
It's copyrighted (yes, I can't sell X-wing fan art), but to say you can't train on it is silly when the elements all come from elsewhere.
@walter4096 @ligniform No, it's not silly at all. It's absolutely logical and normal to say that it's its own thing, even if it's based on something else.
This is a completely weird argument to make. Yes, everything is based on something else; it doesn't mean it has no intrinsic value and thus belongs to everyone??
@duco Basically, « publicly available » doesn't mean free of charge or free of restrictions on use.
YouTube videos are publicly available, yet you're not allowed to download them; it breaches the ToS. I can find an image from Getty in Google search; that doesn't mean I can use it freely on my website ;)
If you see the original work reproduced, you can complain.
It's pointless fretting about this.
Artists' skills produce more if they go into 3D (ZBrush sculpts) and into storyboarding.
I'd love to see movies of Hyperion… The Expanse seasons 7-9… Star Wars EU… re-imaginings of Blake's 7, Space: 1999… This can all happen in an AI world where one person + $2000 can make a film (and $20 can do a 30-second trailer to generate interest if they don't have $2000)
@walter4096 @ligniform I don't understand this viewpoint at all, sorry.
I entirely disagree with the premise, and the result.
@thelinuxEXP doesn't most of this somewhat apply to search engines as well?