This "untrusted data" patch series from Benno Lossin is the result of conversations at last weekend's Rust Linux kernel conference in Copenhagen:

https://lore.kernel.org/all/20240913112643.542914-1-benno.lossin@proton.me/

It's not a "silver bullet" for why we should be using Rust in the Linux kernel, but it is a "big giant sledgehammer" to help squash and prevent MANY common types of kernel vulnerabilities and bugs (remember, "all input is evil!", and this change forces you to always be aware of that, which is something that C in the kernel does not do).

I had always felt that Rust was the future for what we need to do in Linux, but now I'm sure, because if we can do stuff like this, with no overhead involved (it's all checked at build time), then we would be foolish not to give it a real try.

And yes, I've asked for this for years from the C developers, and maybe we can also do it there, but it's not obvious how and no one has come up with a way to do so. Maybe now they will have some more incentive :)

@gregkh just a curious question, as I see you as an expert in the field: say all regular kernel coders and occasional contributors ported their stuff to Rust, what would you estimate is the shortest time it would take to make the Linux kernel 100% Rust (excluding the time it takes for everyone to learn Rust; we just assume they know it tomorrow)?

Are we talking months/years/decades?

@aho Others have done research on how long it would take to reimplement code bases based on their size and importance, see that research for details.

In short, it's not going to happen, and no one is asking for it to happen. Just evolve like normal and all will be fine. The Linux kernel you run today has almost no code that was in the kernel you used 25 years ago, so why would it have the same code you use 25 years from now?

Except for the tty layer, that beast is almost identical to what was around in the beginning, and probably will outlive us all...

@gregkh In a way 25 years feels far away, but it doesn't feel that long ago that I began with my first RedHat 6.2 installation...


@gregkh this nerd sniped me so bad, because I think we really, really need this

dropped a bunch of thoughts for polishing, thinking about how gpu drivers would use this on our ioctl data structs ...


@aho @gregkh I've got a real-life metric: it took me, I think, 2 months of late nights, perhaps 8 hours per night, to rewrite Doom in Rust. That was the MVP, 20k lines. And I've continued hacking away at it and improving it for a couple of years now, with maybe a few hours per month.

On the same topic of "use frameworks to make bugs very hard to create", Alice Ryhl's patches for using a "range" API to access data from userspace:

https://lore.kernel.org/r/20240913210031.20802-1-aliceryhl@google.com

along with examples of how recent binder bugs caused by this issue in C were also present in the Rust implementation, and a proposal for how to prevent that, are another good example of how the language can help us in kernel land by creating APIs that help us do the right thing.
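
As a rough illustration of the "range" idea, here is a minimal userspace sketch. It is not the API from the linked patch: the `Range` type, its methods, and the `TooShort` error are all hypothetical names chosen for this example. The assumption is that an offset/length pair is wrapped in a type that is neither `Copy` nor `Clone`, so every split or read consumes it and the compiler rejects code that reuses a range it has already handed off.

```rust
/// Hypothetical offset/length pair into a buffer. Deliberately not
/// `Clone` or `Copy`: using a range moves it, so it cannot be used twice.
#[derive(Debug)]
pub struct Range {
    offset: usize,
    len: usize,
}

/// Hypothetical error: the range is shorter than the requested split.
#[derive(Debug, PartialEq)]
pub struct TooShort;

impl Range {
    /// Create a range, rejecting offset/length pairs whose end overflows.
    pub fn new(offset: usize, len: usize) -> Option<Range> {
        offset.checked_add(len)?;
        Some(Range { offset, len })
    }

    /// Number of bytes covered by this range.
    pub fn len(&self) -> usize {
        self.len
    }

    /// Split into the first `n` bytes and the remainder. Consumes `self`,
    /// so the unsplit range cannot be used again by accident.
    pub fn split_at(self, n: usize) -> Result<(Range, Range), TooShort> {
        if n > self.len {
            return Err(TooShort);
        }
        // `offset + n` cannot overflow: `n <= len` and `new` checked `offset + len`.
        let front = Range { offset: self.offset, len: n };
        let rest = Range { offset: self.offset + n, len: self.len - n };
        Ok((front, rest))
    }

    /// Consume the range to borrow the bytes it covers.
    /// Returns `None` if the range does not fit inside `buf`.
    pub fn read<'a>(self, buf: &'a [u8]) -> Option<&'a [u8]> {
        buf.get(self.offset..self.offset + self.len)
    }
}
```

With this sketch, calling `header.read(&buf)` twice on the same `header` is a compile error (use of moved value), which is the kind of mistake such an API is meant to rule out; in real code the type would live in its own module so the fields stay private.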

@gregkh

The API introduced in this series is not a silver bullet, users are
still able to access the untrusted value (otherwise how would they be
able to validate it?). But it provides additional guardrails to remind
users that they ought to validate the value before using it. As already
stated, they can access the value directly, but to do that, they need to
explicitly call one of the untrusted_* functions signaling to
reviewers that they are reading untrusted data without validation.

this does not seem to indicate that anything is being checked at build time? is there a part of the patch that demonstrates the zero-overhead build-time checking you describe? or is your point that the rust for linux people are receptive to these concerns and other kernel devs aren't? i'm confused by "this change forces you to always be aware of that, which is something that C in the kernel does not" when the part i quoted very explicitly says it is not a silver bullet and just provides additional guardrails (which is obviously useful, i'm not contesting that)
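
For readers following the build-time question, a minimal userspace sketch of the general technique may help. It is not the code from the patch series: `Untrusted`, `validate`, `untrusted_raw`, `lookup`, and `OutOfRange` are hypothetical names for this example. The assumption is that untrusted input is wrapped in a type with a private field, so using it directly as an index fails to compile, and the only ways out are a validation call or a loudly named escape hatch that reviewers can spot.

```rust
mod untrusted {
    /// Hypothetical wrapper for data from an untrusted source. The field is
    /// private, so code outside this module cannot read it directly.
    /// `repr(transparent)` gives it the same layout as `T`.
    #[repr(transparent)]
    pub struct Untrusted<T>(T);

    impl<T> Untrusted<T> {
        /// Mark a value as untrusted.
        pub fn new(value: T) -> Self {
            Untrusted(value)
        }

        /// Run a caller-supplied check; the inner value only leaves the
        /// wrapper through whatever the check returns.
        pub fn validate<U, E>(self, check: impl FnOnce(T) -> Result<U, E>) -> Result<U, E> {
            check(self.0)
        }

        /// Explicit escape hatch: reads the raw value without validation.
        pub fn untrusted_raw(&self) -> &T {
            &self.0
        }
    }
}

use self::untrusted::Untrusted;

/// Hypothetical error for an index outside the table.
#[derive(Debug, PartialEq)]
pub struct OutOfRange(pub usize);

/// Look up a table entry using an index that came from an untrusted source.
pub fn lookup(table: &[u32], index: Untrusted<usize>) -> Result<u32, OutOfRange> {
    // `table[index]` would not compile here: `Untrusted<usize>` is not a
    // slice index, so the bounds check below cannot be forgotten silently.
    let idx = index.validate(|i| {
        if i < table.len() {
            Ok(i)
        } else {
            Err(OutOfRange(i))
        }
    })?;
    Ok(table[idx])
}
```

In this sketch the wrapper exists only in the type system: it has the same size as the value it wraps, and the check itself is the same runtime bounds test one would write by hand. What moves to build time is the guarantee that the value cannot be used without going through `validate` or the explicitly named `untrusted_raw`.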


@gregkh do you have a link to where you've asked for this sort of thing before?


@gregkh not altogether clear to me that the Validator impls for tarfs (https://lore.kernel.org/all/20240913112643.542914-4-benno.lossin@proton.me/) cannot be done in c by exposing named accessor methods. build-time checking is obv the thing c really can't do but it's not clear to me how this patch demonstrates that and i assume i'm missing something


@gregkh
> but it's not obvious how and no one has come up with a way to do so. Maybe now they will have some more incentive :)

Not sure if this is what you want, but there is __attribute__((tainted_args)) since gcc 12.

@uis oooh, nice, and the documentation for it says it is for something like "a system call in an operating system". Odd, who added it to the compiler and why didn't they talk to any kernel developers about it if this feature is supposed to be for us?

Is there a different operating system out there that uses newer versions of gcc as their primary compiler that is using this?

That being said, it's a good start, and will require us to use -fanalyzer which I think people are working toward, so maybe there is hope!