5️⃣ Here's the 5th installment of posts highlighting key new features of the upcoming v257 release of systemd.

Since its beginnings, systemd has been a heavy user of the D-Bus IPC system. It provides D-Bus APIs, it calls D-Bus APIs, it schedules activation of the D-Bus broker, and it even provides its own C D-Bus client library.

However, from early on our use of D-Bus has come with various major problems. One of the biggest goes something like this:

D-Bus' model is built around a central broker daemon, which is started during boot, but unfortunately relatively late (i.e. together with other, regular daemons instead of in early boot or the initrd). However, systemd brings up the system as a whole and hence needs IPC from the earliest moment on.

And then there are various components of systemd that the D-Bus broker relies on (i.e. consumes functionality of) and hence cannot themselves provide their services on D-Bus, …

…, in order to avoid a chicken/egg problem, a cyclic dependency, and deadlocks. (Example: journald provides logging to the D-Bus broker, and hence cannot provide APIs via D-Bus. The same goes for PID 1 itself, or for systemd-userdb/systemd-homed, which provide the user record resolution that D-Bus needs for its policies, and so on and so on.)

These problems are very hard to tackle. For example in PID 1 itself we provide our D-Bus APIs not just via the broker, but also via another local "direct" socket.

The latter sucks in major ways, since we basically had to reimplement a subset of the broker ourselves, with message multiplexing, subscription, signal matching and a lot of other stuff. Because this was so messy we never did the same for journald, userdbd or homed.

These two are just the biggest issues with D-Bus, but there are a lot more, in my eyes. Hence, quite some time ago we started to use a different type of IPC for these cases, initially just internally.

That alternative IPC is called Varlink (https://varlink.org/). It has been around for a while, and initially we only adopted it where D-Bus was just too bad to use, and only internally. Over the last couple of releases that changed, however: we started to make heavier use of it and to provide public interfaces via Varlink in addition to or instead of D-Bus.

In many ways Varlink is much nicer to work with than D-Bus: it's a lot simpler, its brokerless design makes it a ton faster, …

…, its use of JSON makes it conceptually more compatible with the rest of the world, and various other things.
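
To make the brokerless, JSON-based design concrete, here is a minimal Python sketch of a client talking directly to a service over a stream socket. It assumes the Varlink framing of one JSON object per message, terminated by a NUL byte. The method name `org.example.Ping`, the helper names, and the in-process stand-in service are made up for illustration; a real client would connect to the service's AF_UNIX socket path instead of using `socket.socketpair()`.

```python
import json
import socket
import threading


def encode_message(obj: dict) -> bytes:
    """Serialize one message: a JSON object followed by a single NUL byte."""
    return json.dumps(obj).encode("utf-8") + b"\0"


def read_message(sock: socket.socket) -> dict:
    """Read bytes from the socket up to the terminating NUL and parse them."""
    buf = bytearray()
    while True:
        chunk = sock.recv(1)  # byte-wise for simplicity, not for speed
        if not chunk:
            raise ConnectionError("peer closed before a full message arrived")
        if chunk == b"\0":
            return json.loads(buf.decode("utf-8"))
        buf += chunk


def call(sock: socket.socket, method: str, parameters: dict) -> dict:
    """Send one method call and wait for its single reply."""
    sock.sendall(encode_message({"method": method, "parameters": parameters}))
    reply = read_message(sock)
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return reply.get("parameters", {})


def demo() -> dict:
    # Stand-in for a real service: a socketpair instead of a socket path on disk.
    client, server = socket.socketpair()

    def serve_once() -> None:
        request = read_message(server)
        # Hypothetical service logic: report which method was called.
        server.sendall(encode_message({"parameters": {"called": request["method"]}}))

    worker = threading.Thread(target=serve_once)
    worker.start()
    try:
        return call(client, "org.example.Ping", {})
    finally:
        worker.join()
        client.close()
        server.close()


if __name__ == "__main__":
    print(demo())
```

No broker sits between the two ends: the client writes to the service's socket and reads the reply from the same connection.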

It's also a lot easier to write Varlink services than D-Bus services, because Varlink allows you to handle each connection in a different process, thus being compatible with codebases that do not have event loops (D-Bus, due to its multiplexing, forces you to process all messages within the same process and, due to the global ordering, within a single event loop).

To give one example, "bootctl" is a small tool that installs the systemd-boot boot loader into the ESP for you. It's a command line tool that synchronously copies a bunch of files into the target mount. We always had the plan to turn that into a D-Bus service, but never actually did it, because doing that is a pain: we'd have to turn it into an event loop driven thing, which is just nasty for something so simple that just copies some files.

In a Varlink world, the problem goes away:

we just let systemd's socket activation logic listen on an AF_UNIX/SOCK_STREAM socket, and then let it fork off a new bootctl instance for each connection. That instance then just processes that connection and is done. And it's easy: it just does what it usually does, but instead of reading the commands to execute from the command line it just reads them from a small JSON object it gets from STDIN. And it just writes its output as JSON to STDOUT, done.
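
The per-connection model described above can be sketched in a few lines of Python. This is not bootctl's actual code, just an illustration under stated assumptions: each request is a NUL-terminated JSON object, the worker handles its one connection synchronously without an event loop, and the method name `org.example.Boot.Install` and the `install()` handler are hypothetical.

```python
import io
import json
from typing import BinaryIO, Callable, Dict, Optional

Handler = Callable[[dict], dict]


def install(parameters: dict) -> dict:
    # Hypothetical stand-in for the tool's real work (e.g. copying files).
    return {"installed": True, "target": parameters.get("target", "/")}


HANDLERS: Dict[str, Handler] = {"org.example.Boot.Install": install}


def read_request(stream: BinaryIO) -> Optional[dict]:
    """Read one NUL-terminated JSON object; return None on end of stream."""
    buf = bytearray()
    while True:
        byte = stream.read(1)
        if not byte:
            return None
        if byte == b"\0":
            return json.loads(buf.decode("utf-8"))
        buf += byte


def serve_connection(inp: BinaryIO, out: BinaryIO) -> None:
    """Process requests one after another until the peer closes the connection."""
    while (request := read_request(inp)) is not None:
        handler = HANDLERS.get(request.get("method"))
        if handler is None:
            reply = {"error": "org.varlink.service.MethodNotFound",
                     "parameters": {"method": request.get("method")}}
        else:
            reply = {"parameters": handler(request.get("parameters", {}))}
        out.write(json.dumps(reply).encode("utf-8") + b"\0")
        out.flush()


def demo() -> bytes:
    # In-memory streams stand in for the connection's STDIN and STDOUT.
    request = {"method": "org.example.Boot.Install", "parameters": {"target": "/efi"}}
    inp = io.BytesIO(json.dumps(request).encode("utf-8") + b"\0")
    out = io.BytesIO()
    serve_connection(inp, out)
    return out.getvalue()
```

In a socket-activated setup where the connection is handed over as STDIN and STDOUT, the entry point would presumably be `serve_connection(sys.stdin.buffer, sys.stdout.buffer)`; the process exits once the peer disconnects.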

In fact, because bootctl already…

…supported JSON output anyway, the output side was pretty much done already.

Anyway, there are many other stories like that.

Suffice to say, in v257 there are now 19 Varlink interfaces/services, which we added in a short time, for various things that never had them before when D-Bus was our sole focus, because it was so nasty to add that.

(For comparison: we provide only 11 D-Bus API services at this time).

There are various bindings for Varlink available from the Varlink project, but with systemd v257 we now make systemd's own Varlink implementation available too: sd-varlink.

sd-varlink has been around for quite a while, and hence is quite well tested (including fuzzed) by now. It's already driving your systemd installations, except mostly internally so far.

Since Varlink uses JSON for marshalling its messages, sd-varlink comes with a companion API: sd-json. It's another C library for JSON.

You might of course say that there are already so many of those, so why another? And you'd be right. My answer to that is that sd-json is much nicer to use than most, since it doesn't just allow you to deal with JSON objects, it also helps you with building them from C data structures, and with mapping JSON objects back to C data structures. It also helps you with higher level operations that the low-level JSON datatypes leave you out in the cold with:

for example, it handles base64 encoding/decoding automatically for binary blobs within JSON, it helps you deal with JSON's >53bit integer problem, and various other things.
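
The two problems named here are easy to show in any language. The Python sketch below is not sd-json's API and makes no claim about how sd-json solves them; it only illustrates why raw bytes need an encoding such as base64 to travel inside a JSON string, and why integers beyond 53 bits get corrupted by consumers that store JSON numbers as doubles. Sending large integers as decimal strings is one common workaround, chosen here as an assumption; the field names `blob` and `counter` are made up.

```python
import base64
import json

# 2**53 - 1: up to here every integer is exactly representable as a double.
MAX_SAFE_INTEGER = 2**53 - 1


def to_wire(blob: bytes, counter: int) -> str:
    """Build a JSON record from native values."""
    record = {
        # Raw bytes are not valid JSON string content, so base64-encode them.
        "blob": base64.b64encode(blob).decode("ascii"),
        # Assumed workaround: send large integers as decimal strings.
        "counter": counter if abs(counter) <= MAX_SAFE_INTEGER else str(counter),
    }
    return json.dumps(record)


def from_wire(text: str) -> tuple:
    """Map the JSON record back to native values."""
    record = json.loads(text)
    return base64.b64decode(record["blob"]), int(record["counter"])


def loses_precision(value: int) -> bool:
    """True if a double-based JSON consumer would alter this integer."""
    return int(float(value)) != value
```

For instance, `loses_precision(2**53 + 1)` is true while `loses_precision(2**53)` is not, which is exactly the boundary a 64-bit counter or ID can cross.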

Right now, documentation for sd-varlink and sd-json is scarce (one could even say "barely existing"), but there are plenty of real-life examples in the systemd source tree, of course.

At Linux' best conference, All Systems Go! 2024 in Berlin this year I gave a (brief) talk about Varlink, and why you should consider it. If you want to know more about the concept, this might be a good starting point:

https://media.ccc.de/v/all-systems-go-2024-276-varlink-now-

And that's all for now, enjoy!

@pid_eins
Would it be feasible at any point to rip out at least some of the pre-existing native dbus APIs and provide them via a small bridge which does translation to/from varlink?

@srtcd424 The semantics are too different. I see no advantage of that.

systemd will speak both IPC interfaces in the future.

I'd expect new APIs are probably going to show up more in Varlink than in D-Bus though.

@tvaughan they didn't *really* do varlink though. They didn't understand the concept of a type system and just pushed strings around containing arbitrarily formatted stuff inside of varlink/json strings. I mean, sure it was a form of "adoption" of varlink, but a really terrible one, that I think no one is missing.

@pid_eins I don’t disagree but the solution to those problems didn’t have to be “drop varlink.” I don’t know that they were solved by the rest api that replaced varlink. Podman could have been a driver in wider varlink adoption

@pid_eins Before world-conquest IMO we still need a spec of the protocol. It doesn't need to be very detailed (like D-Bus) but still something implementers can use. We'll want bindings for many more languages than are currently supported. Even many C people would like a gobject-based API, and it'd be hard for people to be motivated to write such from-scratch libraries if they don't have a spec to follow.

So if you really want to push for varlink, I'd suggest creating and publishing that.

@zeenix @pid_eins A real spec would be really helpful so we could distinguish between valid messages and things that just so happen to be accepted by existing implementations.

@jamesh @pid_eins Indeed. I didn't even think about this aspect.

@zeenix @pid_eins basing certification on whether an implementation produces messages that can be consumed by other implementations and consume messages they produce kind of makes it look like there isn't any distinction.

As an example, {"method": "foo", "method": "bar"} is valid JSON, and will likely be accepted by most (all?) implementations. Most will probably also decide that you meant to call the "bar" method. It probably shouldn't be considered a valid message though.
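
As one data point, Python's standard `json` module behaves as described: by default it keeps only the last of several members with the same name. The sketch below shows that, plus a stricter parse built on the module's `object_pairs_hook` parameter. The function names are made up, and this says nothing about what any particular Varlink implementation does.

```python
import json

MESSAGE = '{"method": "foo", "method": "bar"}'


def reject_duplicates(pairs):
    """object_pairs_hook that raises if an object repeats a member name."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate member name: {key!r}")
        seen[key] = value
    return seen


def parse_lenient(text: str) -> dict:
    # Default behaviour: the last duplicate silently wins.
    return json.loads(text)


def parse_strict(text: str) -> dict:
    # The hook sees every name/value pair before the dict is built.
    return json.loads(text, object_pairs_hook=reject_duplicates)
```

Here `parse_lenient(MESSAGE)` yields a call to "bar", while `parse_strict(MESSAGE)` raises a `ValueError`.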

@jamesh @zeenix frankly, I don't think we should second guess JSON on this. i.e. I think your criticism there is valid, but not pointed towards Varlink so much but it should be pointed to JSON. And in fact I-JSON is an RFC that addresses these issues a bit. And to my knowledge all relevant JSON implementations actually follow similar rules on this, or make very clear that behaviour is undefined in some cases.

@jamesh @zeenix Hence, yes I think it would have been good if JSON was defined in stricter terms. But I don't think that Varlink is the place to try to address that. Consult I-JSON for that.

@pid_eins @zeenix I'm not suggesting that implementations should be required to reject messages with duplicate object keys: there's not much value in that, and it may not even be possible with some JSON libraries.

Rather that conforming implementations won't _produce_ messages with duplicate keys. That's something that would need to be in the spec, and wouldn't necessarily be caught by the current cross-compatibility certification process.

@jamesh @zeenix but that too is already covered by the I-JSON RFC:

https://datatracker.ietf.org/doc/html/rfc7493#section-2.3

It uses pretty strong wording:

"Objects in I-JSON messages MUST NOT have members with duplicate names. In this context, "duplicate" means that the names, after processing any escaped characters, are identical sequences of Unicode characters."
