Conversation
Edited 10 months ago
# v6.4 linux-riscv recap thing

Round 3 of these "recap" things. I've found them to be both useful and fun to
write - with a couple of months gone by since this stuff actually got merged by
Linus, it is easy to forget it all!

Starting the SoC support side of stuff again, since that's seemingly the
pattern I have set for myself, v6.4 didn't really see all that much in terms of
the SoCs that were already supported. Some minor changes for both the PolarFire
SoC and Allwinner D1 landed, but nothing interesting.
The main change was the addition of support for the StarFive JH7110 & their
first-party VisionFive 2 SBC. Drivers have been merged so far for pinctrl, clk
reset & mmc - enough to boot a basic system to a console.

On the architectural front though, it feels like it has been a busy merge
window. Looking at Palmer's tree before the merge window opened, it looked like
there was more new content in there than we'd had for the last few releases.
Hopefully that's more than a feeling, but rather a "symptom" of having some
more people contributing reviews and patchwork/automated test becoming more
useful.

A change that will be helpful to those who implement their own RISC-V CPUs as a
hobby or learning experience, support was added for 32-bit kernels without an
MMU. In terms of code, there was little to change, but it is yet another niche
combination that is unlikely to be tested much by developers.

A long-running series that was merged in Palmer's first PR for v6.4 was generic
entry support. This means RISC-V now uses the common, cross architecture code
for entering the kernel during syscalls.
TODO: Bjorn, help.
Guo Ren pushed this effort over the line, with Jisheng Zhang contributing some
patches in that series too. Björn Töpel had a hand in things too from a review
point of view, and in fixing things when we discovered systemd was broken after
the series was applied!

Qinglin Pan's Svnapot support series landed too, Svnapot being the ability to
use naturally-aligned-power-of-two page sizes. I don't know of any actual
systems that support this extension, but perhaps Institute of Software Chinese
Academy of Sciences (ISCAS) have something in the works there.

Palmer & his Rivos colleague Evan Green introduced a new "hwprobe" syscall.
This syscall uses key-value pairs to probe for both things like misaligned
access support & what extensions are available on a platform.
"Just use the ISA string" you might say, but unfortunately some backwards
incompatible changes there prevent it from being sufficiently reliable for
use in the userspace API or ABI.

Drew Jones and Alex Ghiti were the most active developers this time around,
with Alex mainly working on "mm" type things & Drew on extension support for
Zicboz and other cleanups.
Alex's 16 changes included support for setting the paging mode from the command
line, for a 64-bit relocatable kernel & a rework of RISC-V's KASAN support.
The relocatable kernel is a pre-requisite for enabling KASLR & involves creating
a special bit of the kernel that is is effectively a copy of some kernel
commandline (and therefore FDT) parsing functions that is without the various
instrumentation etc that will figure out whether to relocate the kernel or not!
I *think* the latter should allow the syzkaller fuzzer to start working again
for RISC-V. I hope that it doesn't trigger a deluge of problems!
Drew added support for the aforementioned Zicboz extension. Zicboz is comprised
of a single instruction, used to zero a cacheblock, which Drew has used to add
an optimised routine for clear_page(). Along the way, he contributed a rake of
cleanups for the scaffolding surrounding ISA extensions.

Vector has once again missed the merge window, although this time around it was
tantalisingly close, with some of the Rivos folk theorycrafting a situation,
involving userspace scheduling, where we would have broken userspace
compatibility. The fix should be straightforward, so hopefully it makes it for
v6.5!

The second PR of the merge window was minimal, mostly containing a few fixups
for the hwprobe series & other compilation issues. The only feature in the PR
was by StarFive's Sia Jee Heng, adding support for hibernation/suspend to disc.

Speaking of hibernation, issues with that patchset interacting with another
that reworked how we map memory lead to hibernation support being locked behind
the "NONPORTABLE" config option. "NONPORTABLE" is used for options that may
result in a kernel that will not work correctly on cross platform basis.
Specifically here, OpenSBI dropped the "no-map" devicetree property from the
reserved-memory nodes it uses to communicate its PMP protected regions to
supervisor-mode software. Thanks to the changes in how we map memory, the first
2 MiB of memory (where OpenSBI typically resides) is now visible to the kernel
and when it tries to hibernate will access the PMP protected region.
There were quite a few mails back and forth about how to deal with this. The
naive (like me!) would think that if firmware wants to protect itself it should
use "no-map". An RFC patch has been sent to change how all archs handle reserved
memory in the face of suspend etc & we shall see where that goes, but for now
hibernation has been marked non-portable & only for use with systems that
actually protect their reserved regions!

The other thing that has occupied time during the release candidates is dealing
with some of the fallout from enabling relocatable kernels. Early alternatives,
alternatives being our way of patching/modifying the kernel at runtime to work
around errata/differences in extension support, sit in a weird spot where they
are called from pre-mmu C code and need some special treatment. In v6.4-rc1,
with the relocatable kernel support, systems with early alternatives would fail
to boot. The interim fix for this is just building such files with -fno-pie, but
that is going to be very fragile and needs a real fix. Unfortunately the real
fix for this does not seem to be straightforward - one proposal is to relocate
once, in physical memory, do early alternative patching and then relocate
again to the final (virtual) memory addresses. Ouch. The other proposal is to
split out these early alternatives & build them into the section of the kernel
that controls the relocations. Neither sounds particular fun to me...

Why is it so much easier to talk about bugs/issues than about features?!?
1
10
12