The board is confident that the process has resulted in a definition that meets the standards of Open Source as defined in the Open Source Definition and the Four Essential Freedoms, and we’re energized about how this definition positions OSI to facilitate meaningful and practical Open Source guidance for the entire industry.” https://opensource.org/blog/the-open-source-initiative-announces-the-release-of-the-industrys-first-open-source-ai-definition #OpenSource
@osi It does not meet the definition whatsoever and you know it. It promotes violation of every single actually-Open-Source license out there and you know it. But you wanted a piece of the latest scam pie and the people who actually make Open Source (a term I'm likely to stop using now) don't matter in the slightest to you.
I see where you’re coming from, and I appreciate the clarity and depth of your concerns around licensing and Open Source definitions in AI training. I’d love to get your perspective on something I’ve been thinking about, which is how similar AI training data use could be to how humans learn.
When people read or are exposed to various works, even proprietary or confidential information, they incorporate this knowledge broadly rather than attributing specific ideas. In a way, we might even retain key insights from trade secrets or copyrighted material without an explicit obligation to give attribution every time a related idea is expressed.
If AI is working similarly—relying on approximations of knowledge rather than precise lookups—then, arguably, the output isn't a reproduction but more of a unique synthesis or restatement of learned concepts.
Does this approach seem too different from human learning to be applicable to AI? Or do you think AI, by the nature of its structure, necessitates a stricter adherence to the source material in terms of attribution and licensing?
That's fair. That related read was fascinating. Poor dude. It reminded me of that one AI company where, after a while, their AI gets bored and chooses to do other things, like look at pictures, instead of doing the assigned workload.
I do wish training data were more opt-in; I've also thought some sort of royalty scheme would work well.
I'm still developing my stance on the topic. Thanks for the response, and I hope you continue communicating yours.
@taschenorakel @josh @michelin @osi I think the standard term for that is “shared source”, i.e. source-available but not under an OSS licence.
@mirabilos Sure, but why not use a term that fits better and is easier to understand, once you've heard it?
@taschenorakel @josh @michelin @osi it’s more ambiguous; “public source” is too close to “public domain”
@mirabilos @taschenorakel @josh @osi I prefer FOSS/FLOSS really. And yeah, I'm increasingly against using permissive licensing, especially for projects where I have some say.