You hear that, Mr. Kernighan? ... That is the sound of inevitability... It is the sound of improvement...
It's time, Mr. Kernighan...
Step 1: While debugging a kernel issue, you want the return value of a function at startup.
Step 2: Instead of just adding a printk() and recompiling, you try tracing like the cool kids.
Step 3: Since 2023, the function graph tracer can show return values. You add this to the kernel command line and reboot: ftrace=function_graph ftrace_filter=interesting_fn trace_options=funcgraph-retval
Step 4: The trace shows no return value. You find that ARM return value capture exists only for AArch64, not ARM32.
Step 5: You add a printk() and rebuild.
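One way to catch the Step 4 surprise before rebooting is to ask the running kernel whether it knows the option at all. The sketch below is only a rough check under stated assumptions: that tracefs is mounted at /sys/kernel/tracing, that you can read it (usually root only), and that tracer-specific options such as funcgraph-retval are listed in trace_options only while function_graph is the current tracer. The helper name option_state is made up for illustration.

```python
from pathlib import Path

# Assumed tracefs mount point; older setups use /sys/kernel/debug/tracing.
TRACE_OPTIONS = Path("/sys/kernel/tracing/trace_options")


def option_state(trace_options_text, name):
    """Return "on", "off", or "absent" for a tracing option.

    trace_options lists one option per line; a disabled option is
    shown with a "no" prefix (for example "nofuncgraph-retval").
    """
    for entry in trace_options_text.split():
        if entry == name:
            return "on"
        if entry == "no" + name:
            return "off"
    return "absent"


if __name__ == "__main__":
    try:
        text = TRACE_OPTIONS.read_text()
    except OSError as err:  # not root, tracefs not mounted, ...
        print(f"cannot read {TRACE_OPTIONS}: {err}")
    else:
        print("funcgraph-retval:", option_state(text, "funcgraph-retval"))
```

If this reports "absent" with function_graph selected, the kernel was probably built without return value support for that architecture, and the printk() is likely the faster route.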
Yes, to all of it.
https://lore.kernel.org/all/20250809192156.GA1411279@fedora/
What's that mysterious workaround?
The core Huff6 decode step is described in https://fgiesen.wordpress.com/2023/10/29/entropy-decoding-in-oodle-data-x86-64-6-stream-huffman-decoders/
A customer managed to get a fairly consistent repro for transient decode errors by overclocking an i7-14700KF by about 5% from stock settings ("performance" multiplier 56->59).
It took weeks of back and forth and forensic debugging to figure out what actually happens, but TL;DR: the observed decode errors are all consistent with a single instruction misbehaving.
I've hinted at this a bit, but it's finally to a point where I feel comfortable with other people using it.
I've spent the last 8 months iterating on different ways to debug really large and complicated applications. systing is the tool that's come out of this work. I wrote a post introducing it, and there's some documentation in the repo. It's really designed for people like me who have a pretty deep knowledge of the kernel and how userspace interacts with it. Hopefully my fellow kernel developers find it useful.
https://josefbacik.github.io/kernel/systing/debugging/2025/05/08/systing.html