Open source hacker, Linux kernel developer, and creator of the drgn debugger. Chicano 🇲🇽. Likes punk rock.
The next Linux Kernel Debugging Tools Monthly Meeting is tomorrow, Wednesday, May 28th at 11:30 AM Pacific time. As usual, see the agenda on the linux-debuggers mailing list: https://lore.kernel.org/linux-debuggers/aDYy37NBkt7mdaJY@telecaster/T/#u and let me know if you want an invite.

What's that mysterious workaround?

Core Huff6 decode step is described in https://fgiesen.wordpress.com/2023/10/29/entropy-decoding-in-oodle-data-x86-64-6-stream-huffman-decoders/

A customer managed to get a fairly consistent repro for transient decode errors by overclocking an i7-14700KF by about 5% from stock settings ("performance" multiplier 56->59).

It took weeks of back and forth and forensic debugging to figure out what actually happens, but TL;DR: the observed decode errors are all consistent with a single instruction misbehaving.

@vitaut I'm pretty sure we've had some of those bubble up as kernel bug reports lol
@brenns10 yeah, it's also an ideal test case in that it's somewhat mechanical. Maybe a bigger corpus of drgn helpers out there would help
@brenns10 I am extremely amused by Cursor's attempts at a drgn helper, thanks for sharing lol

I've hinted at this a bit, but it's finally at a point where I feel comfortable with other people using it.

I've spent the last 8 months iterating on different ways to debug really large and complicated applications. systing is the tool that's come out of this work. I wrote a post introducing it, and there's some documentation for it in the repo. It's really just designed for people like me who have a pretty deep knowledge of the kernel and how userspace interacts with the kernel. Hopefully my fellow kernel developers find this useful.

https://josefbacik.github.io/kernel/systing/debugging/2025/05/08/systing.html

@ljs @brenns10 FYI the drgn repo has a contrib directory for anyone to dump their scripts in: https://github.com/osandov/drgn/tree/main/contrib. Totally up to you wherever you want to keep them, obviously
@brauner nice! I think it'd be pretty cool to set the coredump handler to a drgn script that could directly extract only the desired information without needing to capture a full core dump at all. I think I saw this scroll by in your discussion with @jann, but I assume /proc/pid/mem, /proc/pid/maps, etc. are still alive at that point? In particular, drgn would love to use /proc/pid/map_files/ so that it unambiguously gets the correct files, but that requires PTRACE_MODE_READ_FSCREDS, so it couldn't be fully unprivileged
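
For anyone curious what the map_files lookup would involve, here is a minimal sketch, not drgn's actual code. It assumes the usual /proc/pid/maps line layout and that entries under /proc/pid/map_files/ are named by the mapping's start and end addresses in hex; the function names are made up for illustration.

```python
def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into (start, end, perms, path).

    The path is an empty string for anonymous mappings.
    """
    # At most 5 splits, so a path containing spaces stays in one piece.
    fields = line.split(None, 5)
    start_s, end_s = fields[0].split("-")
    path = fields[5].rstrip("\n") if len(fields) > 5 else ""
    return int(start_s, 16), int(end_s, 16), fields[1], path


def map_files_path(pid, start, end):
    """Build the map_files entry name for a mapping (hypothetical helper)."""
    return f"/proc/{pid}/map_files/{start:x}-{end:x}"
```

Opening the resulting path is where the PTRACE_MODE_READ_FSCREDS check mentioned above would apply; the sketch only builds the name.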
@ljs @vbabka @brenns10 @jann nah, we don't grab locks. Since we're read-only (ignoring kmodify), the worst that can happen is we follow some stale data and get a Python exception. But kmodify could be used for that if you really wanted.

Really cool that you made the kernel code work in userspace! I wished the Btrfs code could do that when I was working on Btrfs.

The tricky parts of using that code on the in-kernel data are that 1) you need to translate memory accesses into reads from /proc/kcore, and 2) you need to match the appropriate version of the userspace code to the kernel version.
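
To make point 1 concrete, here is a rough sketch of that translation. /proc/kcore is presented as an ELF core file, so a kernel virtual address can be mapped to a file offset through the PT_LOAD program headers. The sketch assumes a little-endian ELF64 file (as on x86-64), and the function names are invented for illustration rather than taken from drgn.

```python
import struct

PT_LOAD = 1
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, little-endian (assumed)
PHDR = struct.Struct("<IIQQQQQQ")          # Elf64_Phdr, little-endian (assumed)


def load_segments(f):
    """Return a list of (vaddr, file_offset, size) for each PT_LOAD segment."""
    f.seek(0)
    ehdr = EHDR.unpack(f.read(EHDR.size))
    if ehdr[0][:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    e_phoff, e_phentsize, e_phnum = ehdr[5], ehdr[9], ehdr[10]
    segments = []
    for i in range(e_phnum):
        f.seek(e_phoff + i * e_phentsize)
        (p_type, _flags, p_offset, p_vaddr,
         _paddr, p_filesz, _memsz, _align) = PHDR.unpack(f.read(PHDR.size))
        if p_type == PT_LOAD:
            segments.append((p_vaddr, p_offset, p_filesz))
    return segments


def read_virtual(f, segments, address, size):
    """Read size bytes at a virtual address by finding the segment covering it."""
    for vaddr, offset, seg_size in segments:
        if vaddr <= address and address + size <= vaddr + seg_size:
            f.seek(offset + (address - vaddr))
            return f.read(size)
    raise LookupError(f"address {address:#x} is not mapped")
```

On a live system you would open /proc/kcore in binary mode (which needs root) and pass the file object to both functions. Reads that straddle two segments are not handled here.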
@ljs @vbabka @brenns10 nope, we walk the maple tree (or red black tree on old kernels) ourselves: https://github.com/osandov/drgn/blob/89260b18b9fa01ffc7d5bfaedeb34aeec0198557/drgn/helpers/linux/mm.py#L1353. We have drgn helpers for walking both data structures, which are basically translations of the C kernel code. The maple tree ones were pretty fun: https://github.com/osandov/drgn/blob/89260b18b9fa01ffc7d5bfaedeb34aeec0198557/drgn/helpers/linux/mapletree.py#L67.

It wouldn't be _too_ hard to translate those to libdrgn, but it'd be tedious.

One other idea for C integration that comes to mind is adding a function to libdrgn that evaluates a string of Python code for you, so that you don't need to resort to system(). We could even have it return the drgn_object to C so you could use libdrgn to get the rest of the way. Would that help, or am I missing the point?

P.S. drgn does have a way to call kernel functions: https://drgn.readthedocs.io/en/latest/helpers.html#kmodify. But that's provided as a way for users to shoot themselves in the foot; it's too risky to use internally.
@ljs @vbabka @brenns10 the big caveat is that all of the helpers (e.g., for looking up VMAs) are indeed Python-only. Stephen and I have both had crazy ideas for using CO-RE or transpiling a subset of C to make things easier for people more comfortable with C like yourself, but that's probably a ways away.
@ljs @vbabka @brenns10 no worries!

> Yeah libdrgn looked pretty dead to me in-repo so that's sad.

Almost all of drgn's core functionality is implemented in libdrgn first and only exposed via Python bindings, so it is very actively developed.

But @brenns10 is right: although we do have some internal users, it doesn't have a stable API or ABI and isn't installed by default.

That's mainly because no one has asked for that yet. I'd be very willing to do a bit of work to close the gap if it's important to you or anyone else.
@supersat @q I remember when you were asking for old boarding passes (to try to figure out how they were generating nonces, maybe?)

Christian Brauner 🦊🐺

Edited 1 month ago

I've done a series that adds support for AF_UNIX sockets in coredumps. Userspace provides an AF_UNIX socket path via core_pattern and the kernel connects to it, shuts down the read side and writes the coredump to the socket.

This means no more super privileged usermode helper upcalls and makes for a very nice API experience. I captured coredumps simply via socat:

https://lore.kernel.org/20250430-work-coredump-socket-v1-0-2faf027dbb47@kernel.org

The receiver can use SO_PEERPIDFD to get a stable handle on the crashed process.
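
As an illustration of the receiving side, here is a minimal Python sketch. It assumes only what the post describes: the kernel connects to a listening AF_UNIX stream socket and writes the coredump to it. The function names and paths are hypothetical, the core_pattern configuration is left to the linked series, and the SO_PEERPIDFD lookup is not shown.

```python
import os
import socket


def listen_for_coredumps(socket_path):
    """Create a listening AF_UNIX stream socket at socket_path."""
    if os.path.exists(socket_path):
        os.unlink(socket_path)  # remove a stale socket from a previous run
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(socket_path)
    server.listen(1)
    return server


def save_next_coredump(server, output_path, chunk_size=65536):
    """Accept one connection and write everything it sends to output_path.

    Returns the number of bytes written.
    """
    conn, _ = server.accept()
    total = 0
    with conn, open(output_path, "wb") as out:
        while True:
            chunk = conn.recv(chunk_size)
            if not chunk:  # sender closed the connection: dump is complete
                break
            out.write(chunk)
            total += len(chunk)
    return total
```

A receiver would call listen_for_coredumps() once and then save_next_coredump() in a loop, choosing a fresh output path for each crash.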

@pinskia should that be [[gnu::noreturn]] volatile asm or am I misunderstanding the intention of the attribute?
@brauner maybe eu-elflint/elflint from elfutils? Not sure off the top of my head how much it validates in core dumps
@monsieuricon I found this Akkoma PR which I thought looked merged: https://akkoma.dev/AkkomaGang/akkoma/pulls/405 but maybe I'm mistaken.
@monsieuricon does social.kernel.org not support verified profile links or am I dumb?