I've been helping the team that brought up https://sashiko.dev/ for AI-generated LKML reviews. It's really impressive to see the wider set of issues that the AI reviews can surface, and this will help with the quality of code in the Linux code base. #sashiko_dev #linux
Roman was responsible for https://sashiko.dev/ and has now posted an announcement about it on LKML: https://lore.kernel.org/lkml/7ia4o6kmpj5s.fsf@castle.c.googlers.com/
"Sashiko was able to find 53% of bugs based on a completely unfiltered set of 1,000 recent upstream issues using "Fixes:" tags (using Gemini 3.1 Pro)."
@irogers nice, thank you for your work!
Is it possible to monitor extra Linux kernel mailing lists (like mptcp@lists.linux.dev) by any chance? If yes, where can we ask? :)
@irogers this is really neat! I started playing with it to see how well an open-weight model like Kimi K2.5 does on the included benchmarks.
Is the team looking at introducing additional tools (static analysis, clang-format, even a built kernel binary + qemu + debugger?) as part of the process?
@asb So Sashiko uses prompts, and for a given patch the AI needs to answer questions like: what locks are held by callers of this function? Most tools like this end up falling back on grep a lot. There is semcode integration, as semcode can compare git branches and answer questions like what calls a function, through use of TreeSitter. Clang-analyzer could be a replacement for TreeSitter, but I have questions about how things like ifdefs are handled.
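To give a feel for the kind of query this enables, here's a minimal sketch (my own illustration, not semcode's actual code) that uses the tree-sitter C API to list call sites of a named function; it assumes the tree-sitter runtime and the tree-sitter-c grammar are linked in:

```c
/*
 * Sketch: list call sites of a named function using the tree-sitter C API.
 * Not semcode's implementation; just the shape of the query it relies on.
 */
#include <stdio.h>
#include <string.h>
#include <tree_sitter/api.h>

const TSLanguage *tree_sitter_c(void);  /* provided by the tree-sitter-c grammar */

static void print_callers(const char *source, const char *target)
{
	TSParser *parser = ts_parser_new();
	ts_parser_set_language(parser, tree_sitter_c());
	TSTree *tree = ts_parser_parse_string(parser, NULL, source, strlen(source));

	/* Match any call whose callee is a plain identifier. */
	const char *pattern = "(call_expression function: (identifier) @callee)";
	uint32_t err_offset;
	TSQueryError err_type;
	TSQuery *query = ts_query_new(tree_sitter_c(), pattern, strlen(pattern),
				      &err_offset, &err_type);

	TSQueryCursor *cursor = ts_query_cursor_new();
	ts_query_cursor_exec(cursor, query, ts_tree_root_node(tree));

	TSQueryMatch match;
	while (ts_query_cursor_next_match(cursor, &match)) {
		TSNode callee = match.captures[0].node;
		uint32_t start = ts_node_start_byte(callee);
		uint32_t len = ts_node_end_byte(callee) - start;
		/* Keep only calls to the function we care about. */
		if (len == strlen(target) &&
		    strncmp(source + start, target, len) == 0) {
			TSPoint p = ts_node_start_point(callee);
			printf("%s() called at line %u\n", target, p.row + 1);
		}
	}

	ts_query_cursor_delete(cursor);
	ts_query_delete(query);
	ts_tree_delete(tree);
	ts_parser_delete(parser);
}

int main(void)
{
	const char *src = "void bar(void) { foo(1); }\n"
			  "void baz(void) { qux(); foo(2); }\n";
	print_callers(src, "foo");
	return 0;
}
```

A real tool also has to cope with calls through function pointers and macros, which is where it gets harder.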
@asb To determine what locks are held, there is work to get -Wthread-safety into the Linux kernel. Automating debugging is on a number of people's radars, including ours.
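For anyone who hasn't seen them, Clang's thread-safety annotations look roughly like this in plain C. This is a hedged sketch with made-up macro names, not the actual helpers being proposed for the kernel:

```c
/*
 * Sketch of Clang -Wthread-safety annotations in C; macro names are
 * illustrative only. Check with: clang -Wthread-safety -fsyntax-only example.c
 */
#define CAPABILITY(x)  __attribute__((capability(x)))
#define GUARDED_BY(m)  __attribute__((guarded_by(m)))
#define REQUIRES(m)    __attribute__((requires_capability(m)))
#define ACQUIRES(m)    __attribute__((acquire_capability(m)))
#define RELEASES(m)    __attribute__((release_capability(m)))

struct CAPABILITY("mutex") mutex { int unused; };

void mutex_lock(struct mutex *m) ACQUIRES(m);
void mutex_unlock(struct mutex *m) RELEASES(m);

static struct mutex stats_lock;
static int stats_count GUARDED_BY(stats_lock);

/* The annotation records that callers must already hold stats_lock. */
static void stats_bump(void) REQUIRES(stats_lock);

static void stats_bump(void)
{
	stats_count++;
}

void stats_inc(void)
{
	mutex_lock(&stats_lock);
	stats_bump();               /* ok: lock is held here */
	mutex_unlock(&stats_lock);
}
```

With those annotations, clang can warn when stats_bump() is called without stats_lock held, which is exactly the "what locks do callers hold" question above.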
@irogers I was thinking more of static analysis or shell-based tools for poking a live system, running a compiler, etc. Maybe it's different for kernel reviews, but mapping to what I'd find useful for LLVM reviews: if I were trying to be the most helpful reviewer possible, I think I'd want a build of the tree before and after the patch, and when I think there's a bug I'd try to write an input that exhibits it so I can feed it back to the author, etc.
@asb I know from vibe coding that this is possible. I've seen Gemini write a patch, write a test, build and run the test, insert extra printf debugging...
@irogers absolutely. Sashiko for now exposes only a very limited set of tools https://github.com/sashiko-dev/sashiko/blob/896718ae10058713059d17aec9fef57d370270f0/src/worker/tools.rs which probably makes a lot of sense initially to keep things focused. But I think there is scope in the future for adding more prompts + exposed tools in areas where actually "kicking the tires" as part of the review might be helpful.
@asb yeah. I believe syzkaller has thousands of open issues. There's lots of scope to automate and make things better. Unfortunately, dealing with LKML can be hard work.
@regehr @irogers here are my notes (from a week ago already, I guess) https://gist.github.com/asb/c4ecf2ebb55570ce63168b8248ab5f2d from the smallest Sashiko benchmark set and a somewhat "vibed" setup. But I need to return to this now that https://github.com/sashiko-dev/sashiko/pull/21 has landed and try something a bit more thorough. Early indications were promising, even if output format issues tripped it up. I'm interested to try with GLM-5 / 5.1 too. Kimi is via a subscription at https://synthetic.new/
@regehr @irogers my starting point had been reading https://lwn.net/Articles/1063303/ which raised the question of what happens if Google stops contributing the compute, but I felt it was a shame it didn't characterise how much compute is needed per review. I hadn't looked at the Sashiko prompts before; it is a more heavyweight (in terms of token count) process than I might have guessed.