
Lorenzo Stoakes

Edited 15 days ago
I'm really excited by the userspace testing I've recently introduced into the VMA code (I risk sounding like I'm tooting my own horn here, but to be clear it's the _concept_ I'm excited by :)

It allows you to compile the vma code independent of the kernel altogether as part of a userland program, which can then execute fundamental VMA operations such as splitting/merging and manipulating VMAs in a virtual address space as you want.

This is amazing because, firstly, the compile time is basically instant. You can also compile with -O0 and get actually reliable gdb, and you can run sanitisers to pick up on any memory leak, use-after-free, etc.

But it is the _identical_ code that the kernel uses (albeit - with stubbed functions).

This allows for incredibly fine-grained unit tests including adding regression tests very, very easily.

It also allows for exciting new possibilities - for instance, if an issue arises in the kernel, you could take a snapshot of the state of the address space and load it up in userspace to manipulate and repro.

You can also do things like fuzzing millions of different combinations of function invocations very very quickly.
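
To give a rough flavour of that kind of fuzzing, here is a minimal sketch against a toy stand-in, not the real VMA code. Everything in it (the `toy_*` names, the array-backed address space, the xorshift PRNG) is a hypothetical illustration: it applies random split/merge operations and checks after each one that the ranges still tile the same span.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX 64

/* Hypothetical stand-in for a VMA: the half-open range [start, end). */
struct toy_range { unsigned long start, end; };

/* Toy address space: sorted, contiguous ranges in a fixed array. */
struct toy_space { struct toy_range r[TOY_MAX]; int n; };

/* Small deterministic PRNG (xorshift32) so a failing seed can be replayed. */
static uint32_t toy_rand(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

/* Split range i at addr; returns 0 on success, -1 if the split is invalid. */
static int toy_split(struct toy_space *s, int i, unsigned long addr)
{
    if (s->n >= TOY_MAX || i < 0 || i >= s->n)
        return -1;
    if (addr <= s->r[i].start || addr >= s->r[i].end)
        return -1;
    for (int j = s->n; j > i + 1; j--)   /* make room at slot i + 1 */
        s->r[j] = s->r[j - 1];
    s->r[i + 1].start = addr;
    s->r[i + 1].end = s->r[i].end;
    s->r[i].end = addr;
    s->n++;
    return 0;
}

/* Merge range i with range i + 1 if they are adjacent. */
static int toy_merge(struct toy_space *s, int i)
{
    if (i < 0 || i + 1 >= s->n || s->r[i].end != s->r[i + 1].start)
        return -1;
    s->r[i].end = s->r[i + 1].end;
    for (int j = i + 1; j < s->n - 1; j++)   /* close the gap */
        s->r[j] = s->r[j + 1];
    s->n--;
    return 0;
}

/* Invariant: non-empty ranges that exactly tile [lo, hi) in order. */
static int toy_check(const struct toy_space *s, unsigned long lo, unsigned long hi)
{
    if (s->n < 1 || s->r[0].start != lo || s->r[s->n - 1].end != hi)
        return 0;
    for (int i = 0; i < s->n; i++) {
        if (s->r[i].start >= s->r[i].end)
            return 0;
        if (i > 0 && s->r[i - 1].end != s->r[i].start)
            return 0;
    }
    return 1;
}

/* Apply iters random ops; returns how many succeeded, or -1 if the invariant broke. */
static long toy_fuzz(uint32_t seed, long iters)
{
    struct toy_space s = { .r = { { 0x1000, 0x100000 } }, .n = 1 };
    uint32_t state = seed ? seed : 1;   /* xorshift must not start at 0 */
    long applied = 0;

    for (long k = 0; k < iters; k++) {
        int i = (int)(toy_rand(&state) % (uint32_t)s.n);
        int rc;

        if (toy_rand(&state) & 1) {
            unsigned long len = s.r[i].end - s.r[i].start;
            rc = toy_split(&s, i, s.r[i].start + toy_rand(&state) % len);
        } else {
            rc = toy_merge(&s, i);
        }
        if (rc == 0)
            applied++;
        if (!toy_check(&s, 0x1000, 0x100000))
            return -1;
    }
    return applied;
}
```

Calling something like `toy_fuzz(seed, 1000000)` from a `main()` built with sanitisers is the general shape of the idea. The real tests would of course drive the actual VMA functions rather than a toy array.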

We've had a number of issues around VMA code that have arisen from subtle bugs that were missed by the bots, missed by any other tests and were picked up by... rpm. That's right, rpm happening to hit certain code paths.

So this represents a really big step forward in making the code more reliable, secure and conveniently testable.

It also encourages patch series that manipulate this code to add unit tests as if it's 2024 or something :)

@ljs pretty cool
i think large parts of netbsd can be compiled in to run in userspace, iirc usb drivers, made to run atop libusb or filesystems with libfuse. i think there was even a similar sort of effort for linux?
like this: https://lwn.net/Articles/639333/

@lkundrak ah yeah nice, wouldn't quite work for this as the functions are intentionally not exported and static etc.

I remember at the startup we were talking about using that, but we did a lot of dumb shit back then just so the founders could show off how clever they were...

Anyway it is a neat idea, not sure if implemented.

You can run tests in UML, but you'd not have access to the internals in the same way, and you'd also not be able to stub things out (a key bit) in the same way.

The VMA stuff is sort of uniquely well suited to this, it's abstracted heavily and you can't necessarily invoke certain actions except through situations that cause them (e.g. mapping one VMA next to another so they merge etc.)
@lkundrak it was agony to make this work btw if you were wondering

@ljs that's beautiful. it reminds me of how much I wish more huge programs were split into little independently usable and testable libraries. web browsers in particular would benefit themselves (from targeted testing/fuzzing) and the world (by being able to reuse all the interesting shims from inside FF and Chrome) so much from this approach

@migratory thanks!

It's all based on Liam's great work in making the maple tree stuff userland testable, without that I could never have done it.

I definitely do like the concept in general, it works really well in practice, even if it can be a total pain to make it happen in reality.

The way we do it is kinda funny: there's an #include "mm/vma.c", which defers its imports to a vma_internal.h file that we replace in the userland version via a header guard.
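
A minimal single-file sketch of that header-guard trick, under loud assumptions: in a real tree these would be separate files, and every name here (`DEMO_VMA_INTERNAL_H`, `demo_vma`, `demo_kernel_hook`, `demo_merge_adjacent`) is hypothetical rather than taken from the actual kernel sources. The userland side claims the guard first and supplies stubs, so the shared code compiles against those instead of the kernel headers.

```c
#include <assert.h>

/* ---- userland replacement for the internal header (hypothetical) ---- */
/* Claim the guard first, so the kernel-side header body below is skipped. */
#define DEMO_VMA_INTERNAL_H

/* Minimal userland stand-in for a kernel type: the range [start, end). */
struct demo_vma {
    unsigned long start;
    unsigned long end;
};

/* Stub for a kernel-only hook; all it does is count calls. */
static int demo_stub_calls;
static void demo_kernel_hook(struct demo_vma *vma)
{
    (void)vma;
    demo_stub_calls++;
}

/* ---- stands in for the shared source file that gets #include'd ---- */
#ifndef DEMO_VMA_INTERNAL_H
#define DEMO_VMA_INTERNAL_H
#error "the kernel build would pull in the real kernel headers here"
#endif

/* Merge next into prev if they are adjacent; returns 1 if merged, else 0. */
static int demo_merge_adjacent(struct demo_vma *prev, const struct demo_vma *next)
{
    if (prev->end != next->start)
        return 0;
    prev->end = next->end;
    demo_kernel_hook(prev);   /* resolves to the stub in userland */
    return 1;
}

/* ---- userland unit test; returns 0 if every check passes ---- */
static int demo_run_tests(void)
{
    struct demo_vma a = { 0x1000, 0x2000 };
    struct demo_vma b = { 0x2000, 0x3000 };
    struct demo_vma c = { 0x5000, 0x6000 };

    demo_stub_calls = 0;
    assert(demo_merge_adjacent(&a, &b) == 1);        /* adjacent: merges */
    assert(a.start == 0x1000 && a.end == 0x3000);
    assert(demo_merge_adjacent(&a, &c) == 0);        /* gap: no merge */
    assert(demo_stub_calls == 1);                    /* hook ran once */
    return 0;
}
```

The point of the sketch is only that the shared code never learns which header it was built against: static functions stay static, and the test reaches them because it lives in the same translation unit.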

@ljs it's definitely the type of thing for which you should sound your own horn, more than seems reasonable to you (i.e. with talks, articles, toots, private discussions, bathroom stickers…).

@Aissen haha thanks.

Very much based on Liam's great work with making the maple tree userland testable.

I may do a blog post about it, still need to integrate into self tests (by having them build + invoke I think) or otherwise get into CI mechanisms.

The first real tests are in my recent series to remove vma_merge(), which I wouldn't even have attempted without some level of testability (very very subtle stuff), before that was more a skeleton implementation.

@ljs That's great stuff :) I had a similar situation where the best way to debug a tree-based data structure was to copy out the relevant code and stub it to run in userspace, and then write a script to generate the specific data structure from the data in the core dump. That way I could step through & instrument the code and quickly fix the issue!
https://github.com/brenns10/kernel_stuff/tree/master/assoc_array_gc

@brenns10 thanks + oh nice that's really cool too!

Yeah similar idea, based on Liam's hard work on making the maple tree stuff userland testable.

It's so much easier to work through problems in this way, I figured out a failure that somebody reported SO quick with it. Then added a regression test... :)

It opens up a lot of exciting possibilities
@brenns10 really cool use of drgn there.

Kind of interesting to see how that could interact with vma-y stuff, maybe take a snapshot (necessarily racy though)

Do need to fiddle about with drgn at some point...

@ljs Yeah, races are possible (and likely in some cases). For some subsystems, mm & slab for example, the act of reading /proc/kcore will cause side effects that alter state... But it still works well enough for most things.

If you do end up playing with drgn let me know, I'd be glad to help out!

@brenns10 @ljs but if you're reading from a core dump, there's no more risk of races ;)