Thinking out loud: is there a way to get systemd to pass a D-Bus system bus socket FD to a service when it’s bus activated (i.e. via `LISTEN_FDS`)? Would mean the service could sandbox AF_UNIX socket connectivity (if it only needed that to connect to the bus, which I guess is true for some bus daemons). Downside ottomh: system bus connection policy would be bypassed (though it allows all connections by default) and the auth would still need to be done by the service.
(I’m probably missing something obvious)
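For context, a minimal sketch of how a service reads the existing `LISTEN_FDS` convention today. It only shows the environment-variable handshake for socket-activated fds. Handing over a system bus connection this way is the hypothetical part of the question, and the helper name `passed_fds` is an illustration rather than a systemd API.

```python
import os

# By convention the first passed descriptor is fd 3 (after stdin/stdout/stderr).
LISTEN_FDS_START = 3


def passed_fds(environ=None, pid=None):
    """Return the fd numbers handed over via LISTEN_FDS, or [] if there are none."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    try:
        # LISTEN_PID guards against the variables leaking into child processes.
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        return []
    if count < 0:
        return []
    return list(range(LISTEN_FDS_START, LISTEN_FDS_START + count))
```

A service receiving a pre-connected socket this way would still have to do the D-Bus authentication itself, as noted above.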
@pwithnall I talked to @pid_eins years ago about this for Sysprof because we would want it for portable services to pre-own a name and hand off the connection.
@chergert @pwithnall @pid_eins This is still an idea floating around. Letting systemd hold your name on the bus and then handing over an already initialized connection via socket activation. It would be a new type of unit, .busname or so
@AdrianVovk @chergert @pwithnall @pid_eins or we could move to varlink 🙃
@swick @AdrianVovk @pwithnall @pid_eins
I wrote a varlink implementation last year and I should probably write about the two-dozen or so issues I had with it on the protocol level.
@chergert @swick @pwithnall @pid_eins Please do write it down somewhere so it can be discussed and resolved :)
As far as I understand, systemd has taken over ownership of Varlink and has the latitude to iterate on the protocol
@AdrianVovk @swick @pwithnall @pid_eins
It can't be fixed unless it is completely different. It's good enough for its purpose, but that is almost certainly not anything application-wise.
@AdrianVovk @swick @pwithnall @pid_eins
Separating the protocol framing from the payload is essential for performance cost accounting because otherwise you must read (and parse) the _entire_ message before dispatching to workers.
It would be better for that cost to get associated with the RPC handler rather than socket code.
This makes profiling performance issues much nicer beyond just being more hygienic.
@AdrianVovk @swick @pwithnall @pid_eins
Not being able to multiplex means for anything non-trivial you now need one connection to manage a gaggle of other sockets. Hello complexity and cross-socket ordering races based on timing of socket read.
i.e. the very thing you were trying to avoid.
It is also difficult when you cannot bump FD limits due to a library using select() somewhere. So now your multiplexing is further limited.
App is responsible for failure cases, so this is unmanageable.
@chergert @AdrianVovk @swick @pwithnall uh, sockets are cheap on Linux. The concept on purpose has no handshake protocol so that you can keep multiple sockets open so that you have efficient, simple concurrency, but still have strict ordering between related notifications where you want it. A lot of D-Bus code in the wild creates a connection, does one method call and closes it (systemctl and such tools, or NSS resolution code).
@AdrianVovk @swick @pwithnall @pid_eins
But some good things are that you can do probably lower latency things, priority inheritance between processes, etc.
All good things for cross-daemon communication and early init.
So fine to live there, but I definitely wouldn't want it growing outside of that, no matter how readable shoving tee in the middle is.
@AdrianVovk @swick @pwithnall @pid_eins
Another good thing was the protocol interface description. Extremely easy to write a recursive descent parser for.
@AdrianVovk @swick @pwithnall @pid_eins
Another bad thing in framing is "more" leaks the framing information into the application APIs. It sort of just turns out gross.
For typed APIs, it sucks because you need to create an enumerator for consuming everything even if they never get used that way.
And since it's in the framing protocol, not much you can do about it, esp if you want to generate proxies.
@AdrianVovk @swick @pwithnall @pid_eins
I had significant trouble creating clean call-site code. This of course could just be my lack of imagination/inspiration, but having done a lot of RPC in my career I suspect half of the reason is the protocol.
And I've dealt with bad protocols. Having to implement BSON/mongodb wire protocol there was some of the worst.
@AdrianVovk @swick @pwithnall @pid_eins
The protocol explicitly denies depth greater than 1000 which sure, but that limits the types of structures you can send across just because the framing of the protocol didn't separate framing from payload.
I think this should be separated and/or pushed off to the decoder per-RPC so that you can have much shallower restrictions too.
@AdrianVovk @swick @pwithnall @pid_eins
Not having intermixed events and methods is extremely obnoxious from an application standpoint. You now need a socketpair for events and another socketpair for methods.
And of course you only get one event stream so better handle parsing the typed events on the other side and dispatching.
@chergert @AdrianVovk @swick @pwithnall that pattern is absolutely awful on D-Bus, because of the excessive penalty of D-Bus handshakes. It's really nice with Varlink, since connections are cheap, independent, bottleneck-free.
@chergert @AdrianVovk @swick @pwithnall sorry, but you are holding it wrong. You want to use Varlink like D-Bus. Terrible idea. Connections are cheap, fds are cheap, and parallelization is good. 2 sockets is not a minus, it's a plus.
@pwithnall there are somewhat detailed hashed out plans for that somewhere, we discussed that many times with dbus-broker folks, but nobody actually sat down to implement them. These days I am pretty sure Varlink is the better, simpler alternative for most cases though, from my side I hence doubt I would myself put any further energy into this (though I am happy to review a patch).
@chergert @AdrianVovk @swick @pwithnall but varlink does separate framing from payload: the NUL byte separating messages is not a JSON concept. You hence just scan for the NUL byte for framing and then what you got between is the payload. (Or not sure what you are getting at)
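A minimal sketch of the NUL-byte framing described here: scan the receive buffer for NUL bytes, and whatever sits between them is one JSON payload. This works because raw NUL bytes cannot appear inside valid JSON text. The function name and the `org.example.*` method names used with it are illustrative assumptions, not part of any Varlink library.

```python
import json


def split_messages(buffer: bytes):
    """Split a receive buffer into complete NUL-terminated JSON messages.

    Returns (messages, remainder), where remainder is the trailing partial
    message that has not yet been terminated by a NUL byte.
    """
    # Everything before the last NUL is complete; the final piece is leftover.
    *complete, remainder = buffer.split(b"\0")
    return [json.loads(chunk) for chunk in complete], remainder
```

Note that each complete chunk is still parsed as a whole before the method name is known, which is the cost the earlier posts in the thread object to.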
@chergert @AdrianVovk @swick @pwithnall sorry, but multiplexing is the worst of D-Bus's ideas. A performance killer and the global ordering is pretty useless. For performance you want multiple independent channels and for correctness you need local ordering only. That's what Varlink delivers...
@chergert @AdrianVovk @swick @pwithnall and designing your IPC for select() is a really really really terrible idea. in 2025 select() should be made illegal. The Linux kernel and systemd made sure we raised all fd limits to very high values so that fds are now cheap. Embrace it! You have to, because all new kernel apis rely on that too (i.e. pidfds and so on). New kernel apis typically just hand out fds for things (see bpf apis for example where programs, objects, and even attachments are all...
@chergert @AdrianVovk @swick @pwithnall encapsulated by fds, precisely because they are cheap now. Hence get over your 1990s UNIX image: fds are not a scarce resource anymore. They are just handles, and you can have a lot of them.
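For illustration, a small sketch of waiting on several sockets with `poll()` instead of `select()`. The relevant difference is that `select()` cannot handle descriptor numbers at or above `FD_SETSIZE` (typically 1024), while `poll()` takes arbitrary fd numbers. The helper name `wait_readable` is made up for this example.

```python
import select


def wait_readable(socks, timeout_ms=1000):
    """Return the sockets from `socks` that are readable, using poll()."""
    poller = select.poll()
    by_fd = {s.fileno(): s for s in socks}
    for fd in by_fd:
        poller.register(fd, select.POLLIN)
    # poll() reports (fd, event mask) pairs; fd values are not capped at 1024.
    return [by_fd[fd] for fd, events in poller.poll(timeout_ms)
            if events & select.POLLIN]
```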
@pid_eins That makes sense, thank you for the replies! :) I’m afraid I’m not in a position to write a patch for it; I was mostly curious as to whether I was missing something obvious (and it seems not). The daemon is already written so is sticking with D-Bus not varlink, but this thread is full of things for me to bear in mind for the next daemon I write :)
@pid_eins @AdrianVovk @swick @pwithnall
It's not that I want to use it like D-Bus, it's that I want a consumable API that doesn't make all surrounding code more complex.
@chergert @AdrianVovk @swick @pwithnall oh, varlink is so much easier and simpler because you can keep all calls separate, and do not have to multiplex, filter, and so on. Unlike D-Bus where you have connections, object paths, service names/unique names, fds that are all required to find your way to a destination, in varlink you just have a socket address and one fd. That is *drastically* simpler. Conceptually and in code.
@pid_eins @chergert @swick @pwithnall local ordering per client or per call? If I subscribe to two different streams of events, it is occasionally useful to know the order these events are received in amongst each other
@AdrianVovk @chergert @swick @pwithnall local ordering per stream of calls. Generally the expectation though is that if you subscribe to sets of notifications you do this via a single call so that you get a single stream back. Hence what you want to subscribe to must be conceptually known at the moment where you implement a service, because you have to return a single stream of events with everything in it. But that should generally be fine, D-Bus kinda implies that too actually
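A rough sketch of the "one call, one stream" pattern: the client sends a single call with `"more": true`, and the service answers with a sequence of replies flagged `"continues": true` until the last one. The method name `org.example.Events.Subscribe`, the `topics` parameter and both helper names are hypothetical; only the `more`/`continues` flags come from the Varlink wire format.

```python
import json


def build_subscribe_call(topics):
    """Encode one streaming call that subscribes to several topics at once."""
    call = {
        "method": "org.example.Events.Subscribe",  # hypothetical method
        "parameters": {"topics": topics},          # hypothetical parameter
        "more": True,                              # ask for a reply stream
    }
    return json.dumps(call).encode() + b"\0"


def iter_stream_replies(raw: bytes):
    """Yield reply parameters in arrival order, stopping after the final reply."""
    for chunk in raw.split(b"\0"):
        if not chunk:
            continue
        reply = json.loads(chunk)
        yield reply.get("parameters", {})
        if not reply.get("continues", False):
            return  # last reply of this call's stream
```

Because all subscribed events arrive on the one stream, their relative order is simply the order in which the replies are read.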
@pid_eins @AdrianVovk @swick @pwithnall
That only works when you're guaranteed to have framing data _before_ your argument data (which you are not).
@chergert @AdrianVovk @swick @pwithnall so i take it you want to know the size of the next oncoming message first so that you can allocate a properly sized memory buffer for it and don't want to copy, or want to decode already while parsing? I think that's a bogus optimization attempt, because memory copying is the one thing that is ridiculously fast on modern cpus (unlike roundtrips for example which D-Bus is forcing so ridiculously many of). But more importantly what you theoretically gain..
@pid_eins @AdrianVovk @swick @pwithnall
But if anyone is interested in running with it, here is where I put together a prototype but stopped after I realized I absolutely didn't want Varlink code in my applications.
It does a lot of the tricks I did to try to speed up GVariant for building typed messages.
@chergert @AdrianVovk @swick @pwithnall .. here during msg input you'd lose during message output, because you have to marshal to text first, then determine the size so that you can write it out first. It's a zero-sum game, and hence pointless...
@pid_eins @AdrianVovk @swick @pwithnall
I certainly make _most_ of my apps bump at startup to the max. But there are a few cases where I can't (things that link against libimobiledevice for example).
@chergert @AdrianVovk @swick @pwithnall in systemd we vet our deps, and if we see a dep that uses select() then it's pretty obviously not up to today's standards and not acceptable to use.
I am pretty sure usage of select() should be considered a bug in that library, just like any other bug. It's 2025 ffs.
@pid_eins @chergert @swick @pwithnall I think the idea is to make it possible to separate the payload from the message. Strip off the framing and then forward it elsewhere
Over in an issue Christian gave an example along the lines of:
{"method": "DoSomething", ...}\0{"arg1": "asdf", "arg2": "foobar"}\0\0
@AdrianVovk @chergert @swick @pwithnall json has subobjects, it's really nicely nestable. I doubt there is too much value in being able to split things on the byte level like this. I mean usually json objects are complex, and if you forward parts of a varlink message elsewhere then it's highly unlikely you want to do that precisely at the toplevel message level rather than on some level below i.e. one of the passed args or even a part of a passed arg. Or to turn this around:
@AdrianVovk @chergert @swick @pwithnall if you want all func params as one then it's more likely you also want the func name along with it than not. Hence having a binary "cut" between method name and params seems weird and unnecessary to me.
@AdrianVovk @chergert @swick @pwithnall also having a binary cut vs a json level cut between func name and params only is relevant if you assume marshalling is where you lose performance. But that's bs. You lose performance because of roundtrips (and that's so so so so bad on dbus) and on the strict ordering/strict single threading that dbus implies. Hence: don't lose yourself too much in marshalling questions, focus on the stuff that really matters, i.e. the more general structure of the IPC.
@pid_eins @chergert @swick @pwithnall Welp, easier said than done. The dependency tree of $insert GUI app here is going to be a lot more complicated than systemd's. Though it probably wouldn't hurt to figure out which of our dependencies use select and then file issues
@AdrianVovk @pid_eins @swick @pwithnall
The libimobiledevice one was particularly tricky because it could get pulled in from GVFS and then all the other modules (which had no idea they were "linking" against it via dlopen()) go womp womp.
@chergert @AdrianVovk @swick @pwithnall if you really really want to use such a legacy library my suggestion would be to do it out of process, so it doesn't fuck up the rest of the codebase. And of course, if it's open source just fix the damn library. And if it's not open source, then it should be out of process anyway...
@pid_eins @AdrianVovk @swick @pwithnall
Yeah I agree, but there are libraries we still have out there that break it. And it's obnoxious now that we could be using 240 dma-buf / sec + the grip of them used in Mesa vulkan caches. We regularly bump up against 1024 in GTK apps now.
@chergert @AdrianVovk @swick @pwithnall any non-trivial program should bump soft rlimitnofile to hard rlimitnofile during early initialization and never look back.
(Glib should probably help with that and in particular auto-reset the soft limit for spawned processes back to 1024 for compat/safety.)
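A minimal sketch of that advice: raise the soft `RLIMIT_NOFILE` to the hard limit once during early startup. The function name is an assumption for this example, and resetting the soft limit for spawned children (the GLib suggestion) is not shown.

```python
import resource


def bump_nofile_limit():
    """Raise the soft open-file limit to the hard limit; return the new pair."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft != hard:
        # An unprivileged process may raise its soft limit up to the hard limit.
        resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

This is only safe if nothing in the process uses `select()`, which is the problem with the libimobiledevice case raised earlier in the thread.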
@chergert @AdrianVovk @swick @pwithnall i am kinda tempted to add a thing to systemd where via bpf we automatically complain about use of legacy apis such as select() in the logs. As an attempt to push people to update their code...
@pid_eins @AdrianVovk @swick @pwithnall
Let me be clear, I don't like D-Bus either.
But I don't want to go against the grain unless I can be certain that alternative direction is going to supplant the status-quo.
And this all feels very unfinished from the application API standpoint.
We've worked so hard in GTK 4 to focus on API ergonomics and work backwards to the technology. This feels quite the opposite.
@chergert @AdrianVovk @swick @pwithnall my experience is quite the opposite. After having written both sd-bus and sd-varlink and very complex apps using both, and knowing both protocols quite well, i am very strongly of the opinion that writing varlink is so so so so much nicer and quicker and simpler. The fact that the number of varlink apis in systemd almost exploded in a very short time while our dbus apis kinda stagnate in their comprehensiveness is testament to that.
@chergert @AdrianVovk @swick @pwithnall for the first time it feels like adding an ipc api to some low level os functionality is a 1h job from writing, over testing, to submitting, while for dbus it was more of a 1 *day* job if you were lucky, and more if not.
@pid_eins @AdrianVovk @swick @pwithnall
What are the plans for observability? Do we need to create eBPF programs to snoop? If so, can we get any attributes on sockets intended for varlink?
I ask with both Sysprof and D-Spy hats on.
(Clearly we can't interject `tee` in the middle system-wide).
@chergert @AdrianVovk @swick @pwithnall right now i trace varlink stuff mostly via simple strace. It's quite sufficient to me, since the messages are immediately readable, and transactions directly recognizable by their fd and isolated on syscall param level already. Hence so far I haven't needed more complex tracers as would be appropriate for a datagram based multipeer protocol such as dbus where messages must be decoded before being readable and put into transactional context explicitly.
@chergert @AdrianVovk @swick @pwithnall but yes, if we want varlink support for those kind of tracers we'd need to make varlink sockets recognizable somehow (xattrs would be perfect for that but right now kernel explicitly refuses xattrs on S_IFSOCK inodes), and then pick them up via a simple bpf prog.
@pavel @AdrianVovk @swick @pwithnall @chergert poll() has been pretty universal for a long time. It's in posix and all the bsds have it, and so does macosx since 2005. I think coverage is pretty much complete. If you run archaic software, sure you'd get these logs, but they'd just be logs, so can be ignored...