First talk of the day with Greg @gregkh KH to talk about Untrusted data in Linux: How Rust is going to save us.
I'm at the recording for the Rust In Production podcast on "Oxidizing the Linux Kernel" with Greg @gregkh KH, Alice Ryhl and Matthias @mre Endler.
#RustWeek #RustWeek2026 #LinuxKernel #RustLang #RustForLinux
What is never? It's a Rust type that's pretty simple to define (not).
"If Linux can be maintained by sending patches to an email mailing list, 'doesn’t work at scale' arguments are skill issues."
https://dbushell.com/2026/04/29/github-is-sinking/
Typical ML argument: "If I can read something legally, why can't I train an LLM on it?"
Humans are capable of reading things and later writing a similar thing that is still a copyright violation. If I go and write a book that follows the plot line of Star Wars, that's still a copyright violation, even if no text is literally the same. If I play the melody to a song on my piano and release it without the appropriate mechanical cover license, that's also a copyright violation.
The reason this does not happen often is that, as humans, we are aware that that's plagiarism and there are rules. Sometimes it happens by accident, and people still get sued and lose.
LLMs have no such awareness and routinely output things which are blatant copyright violations when appropriately prompted. That means the model weights encode that work and are therefore themselves a derivative work.
Your brain encodes a massive amount of copyrighted information. You are not a walking copyright violation because humans aren't data, can't be copied and distributed en masse, have human rights, etc. This is why "mind reading machines" are a classic dystopian plot point (monetizing your thoughts etc).
An LLM is not a human, does not have human rights, nor human privileges. It is data, and if it encodes copyrighted information, that's a derivative work. If you aren't following the license of the training data, that's a copyright violation.
@dascandy everything graphed risks becoming a goal in itself, so I make sure we graph everything 😀