social.kernel.org

Conversation

Philip Withnall

pwithnall@mastodon.social

Thinking out loud: is there a way to get systemd to pass a D-Bus system bus socket FD to a service when it’s bus activated (i.e. via `LISTEN_FDS`)? Would mean the service could sandbox AF_UNIX socket connectivity (if it only needed that to connect to the bus, which I guess is true for some bus daemons). Downside ottomh: system bus connection policy would be bypassed (though it allows all connections by default) and the auth would still need to be done by the service.
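
For reference, a minimal sketch of the generic `LISTEN_FDS` convention the question refers to. It only shows how a service discovers descriptors that the service manager passed in; per the replies in this thread, systemd does not hand over a D-Bus connection this way today. The helper name `passed_fds` and its parameters are made up for illustration.

```python
import os

SD_LISTEN_FDS_START = 3  # first passed descriptor in the socket-activation convention


def passed_fds(environ=os.environ, pid=None):
    """Return the descriptor numbers announced via LISTEN_PID/LISTEN_FDS."""
    pid = os.getpid() if pid is None else pid
    try:
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []  # the variables were meant for a different process
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        return []  # unset or malformed: nothing was passed
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + max(count, 0)))
```

A service sandboxed against creating new AF_UNIX connections would then wrap the returned descriptor in its bus library instead of calling `connect()` itself; authentication on that connection would still be the service's job, as the post notes.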

Philip Withnall

pwithnall@mastodon.social

Reply to @pwithnall@mastodon.social

Edited 3 months ago

(I’m probably missing something obvious)

Christian Hergert

chergert@my.devsuite.app

Reply to @pwithnall@mastodon.social

@pwithnall I talked to @pid_eins years ago about this for Sysprof because we would want it for portable services to pre-own a name and hand off the connection.

Adrian Vovk

AdrianVovk@fosstodon.org

Reply to @chergert@my.devsuite.app

@chergert @pwithnall @pid_eins This is still an idea floating around. Letting systemd hold your name on the bus and then handing over an already initialized connection via socket activation. It would be a new type of unit, .busname or so

Sebastian Wick

swick@fosstodon.org

Reply to @AdrianVovk@fosstodon.org

@AdrianVovk @chergert @pwithnall @pid_eins or we could move to varlink 🙃

Christian Hergert

chergert@my.devsuite.app

Reply to @swick@fosstodon.org

@swick @AdrianVovk @pwithnall @pid_eins

I wrote a varlink implementation last year and I should probably write about the two-dozen or so issues I had with it on the protocol level.

Adrian Vovk

AdrianVovk@fosstodon.org

Reply to @chergert@my.devsuite.app

@chergert @swick @pwithnall @pid_eins Please do write it down somewhere so it can be discussed and resolved :)

As far as I understand, systemd has taken over ownership for Varlink and has the latitude to iterate on the protocol

Christian Hergert

chergert@my.devsuite.app

Reply to @AdrianVovk@fosstodon.org

Edited 3 months ago

@AdrianVovk @swick @pwithnall @pid_eins

It can't be fixed unless it is completely different. It's good enough for its purpose, but that is almost certainly not anything application-wise.

Christian Hergert

chergert@my.devsuite.app

Reply to @chergert@my.devsuite.app

Edited 3 months ago

@AdrianVovk @swick @pwithnall @pid_eins

Separating the protocol framing from the payload is essential for performance cost accounting because otherwise you must read (and parse) the _entire_ message before dispatching to workers.

It would be better for that cost to get associated with the RPC handler rather than socket code.

This makes profiling performance issues much nicer beyond just being more hygienic.

Christian Hergert

chergert@my.devsuite.app

Reply to @chergert@my.devsuite.app

Edited 3 months ago

@AdrianVovk @swick @pwithnall @pid_eins

Not being able to multiplex means for anything non-trivial you now need one connection to manage a gaggle of other sockets. Hello complexity and cross-socket ordering races based on timing of socket read.

i.e. the very thing this was trying to avoid

It is also difficult when you cannot bump FD limits due to a library using select() somewhere. So now your multiplexing is further limited.

App is responsible for failure cases, so this is unmanageable.

Lennart Poettering

pid_eins@mastodon.social

Reply to @chergert@my.devsuite.app

@chergert @AdrianVovk @swick @pwithnall uh, sockets are cheap on Linux. The concept on purpose has no handshake protocol so that you can keep multiple sockets open so that you have efficient, simple concurrency, but still have strict ordering between related notifications where you want it. A lot of D-Bus code in the wild creates a connection, does one method call and closes it (systemctl and such tools, or NSS resolution code).

Christian Hergert

chergert@my.devsuite.app

Reply to @chergert@my.devsuite.app

@AdrianVovk @swick @pwithnall @pid_eins

But some good things are that you can do probably lower latency things, priority inheritance between processes, etc.

All good things for cross-daemon communication and early init.

So fine to live there, but I definitely wouldn't want it growing outside of that, no matter how readable shoving tee in the middle is.

Christian Hergert

chergert@my.devsuite.app

Reply to @chergert@my.devsuite.app

@AdrianVovk @swick @pwithnall @pid_eins

Another good thing was the protocol interface description. Extremely easy to write a recursive descent parser for.

Christian Hergert

chergert@my.devsuite.app

Reply to @chergert@my.devsuite.app

Edited 3 months ago

@AdrianVovk @swick @pwithnall @pid_eins

Another bad thing in framing is "more" leaks the framing information into the application APIs. It sort of just turns out gross.

For typed APIs, it sucks because you need to create an enumerator for consuming everything even if they never get used that way.

And since it's in the framing protocol, not much you can do about it, esp if you want to generate proxies.

Christian Hergert

chergert@my.devsuite.app

Reply to @chergert@my.devsuite.app

Edited 3 months ago

@AdrianVovk @swick @pwithnall @pid_eins

I had significant trouble creating clean call-site code. This of course could just be my lack of imagination/inspiration, but having done a lot of RPC in my career I suspect half of the reason is the protocol.

And I've dealt with bad protocols. Having to implement the BSON/MongoDB wire protocol was some of the worst.

Christian Hergert

chergert@my.devsuite.app

Reply to @chergert@my.devsuite.app

@AdrianVovk @swick @pwithnall @pid_eins

The protocol explicitly denies depth greater than 1000 which sure, but that limits the types of structures you can send across just because the framing of the protocol didn't separate framing from payload.

I think this should be separated and/or pushed off to the decoder per-RPC so that you can have much shallower restrictions too.

Christian Hergert

chergert@my.devsuite.app

Reply to @chergert@my.devsuite.app

@AdrianVovk @swick @pwithnall @pid_eins

Not having intermixed events and methods is extremely obnoxious from an application standpoint. You now need a socketpair for events and another socketpair for methods.

And of course you only get one event stream so better handle parsing the typed events on the other side and dispatching.

Lennart Poettering

pid_eins@mastodon.social

Reply to @pid_eins@mastodon.social

@chergert @AdrianVovk @swick @pwithnall that pattern is absolutely awful on D-Bus, because of the excessive penalty of D-Bus handshakes. It's really nice with Varlink, since connections are cheap, independent, bottleneck-free.

Lennart Poettering

pid_eins@mastodon.social

Reply to @chergert@my.devsuite.app

@chergert @AdrianVovk @swick @pwithnall sorry, but you are holding it wrong. You want to use Varlink like D-Bus. Terrible idea. Connections are cheap, fds are cheap, and parallelization is good. 2 sockets is not a minus, it's a plus.

Lennart Poettering

pid_eins@mastodon.social

Reply to @pwithnall@mastodon.social

@pwithnall there are somewhat detailed hashed out plans for that somewhere, we discussed that many times with dbus-broker folks, but nobody actually sat down to implement them. These days I am pretty sure Varlink is the better, simpler alternative for most cases though, from my side I hence doubt I would myself put any further energy into this (though I am happy to review a patch).

Lennart Poettering

pid_eins@mastodon.social

Reply to @chergert@my.devsuite.app

@chergert @AdrianVovk @swick @pwithnall but varlink does separate framing from payload: the NUL byte separating messages is not a JSON concept. You hence just scan for the NUL byte for framing and then what you got between is the payload. (Or not sure what you are getting at)
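
A small sketch of the framing step described here, assuming messages arrive as arbitrary chunks of a byte stream. The function names are illustrative only. Scanning for the delimiter is safe because JSON text cannot contain a raw NUL byte (it would have to be escaped as `\u0000`).

```python
import json


def split_frames(buffer: bytes):
    """Split a receive buffer on NUL bytes.

    Returns (complete_frames, remainder); the remainder is an incomplete
    message that still needs more data from the socket.
    """
    *frames, remainder = buffer.split(b"\0")
    return frames, remainder


def decode(frame: bytes):
    """Parse one frame's payload; JSON parsing only happens at this point."""
    return json.loads(frame)
```

Note that the frame boundary is only known once the whole message has been received, which is the point raised later in the thread about framing data not preceding the argument data.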

Lennart Poettering

pid_eins@mastodon.social

Reply to @chergert@my.devsuite.app

@chergert @AdrianVovk @swick @pwithnall sorry, but multiplexing is the worst of ideas of D-Bus. A performance killer and the global ordering is pretty useless. For performance you want multiple independent channels and for correctness you need local ordering only. That's what Varlink delivers...

Lennart Poettering

pid_eins@mastodon.social

Reply to @chergert@my.devsuite.app

@chergert @AdrianVovk @swick @pwithnall and designing your IPC for select() is a really really really terrible idea. in 2025 select() should be made illegal. The Linux kernel and systemd made sure we raised all fd limits to very high values so that fds are now cheap. Embrace it! You have to, because all new kernel apis rely on that too (i.e. pidfds and so on). New kernel apis typically just hand out fds for things (see bpf apis for example where programs, objects, and even attachments are all...

Lennart Poettering

pid_eins@mastodon.social

Reply to @pid_eins@mastodon.social

@chergert @AdrianVovk @swick @pwithnall encapsulated by fds, precisely because they are cheap now. Hence get over your 1990s UNIX image: fds are not a scarce resource anymore. They are just handles, and you can have a lot of them.

Philip Withnall

pwithnall@mastodon.social

Reply to @pid_eins@mastodon.social

@pid_eins That makes sense, thank you for the replies! :) I’m afraid I’m not in a position to write a patch for it; I was mostly curious as to whether I was missing something obvious (and it seems not). The daemon is already written so is sticking with D-Bus not varlink, but this thread is full of things for me to bear in mind for the next daemon I write :)

Christian Hergert

chergert@my.devsuite.app

Reply to @pid_eins@mastodon.social

@pid_eins @AdrianVovk @swick @pwithnall

It's not that I want to use it like D-Bus, it's that I want a consumable API that doesn't make all surrounding code more complex.

Lennart Poettering

pid_eins@mastodon.social

Reply to @chergert@my.devsuite.app

Edited 3 months ago

@chergert @AdrianVovk @swick @pwithnall oh, varlink is so much easier and simpler because you can keep all calls separate, and do not have to multiplex, filter, and so on. Unlike D-Bus where you have connections, object paths, service names/unique names, fds that are all required to find your way to a destination, in varlink you just have a socket address and one fd. That is *drastically* simpler. Conceptually and in code.

Adrian Vovk

AdrianVovk@fosstodon.org

Reply to @pid_eins@mastodon.social

@pid_eins @chergert @swick @pwithnall local ordering per client or per call? If I subscribe to two different streams of events, it is occasionally useful to know the order these events are received in amongst each other

Lennart Poettering

pid_eins@mastodon.social

Reply to @AdrianVovk@fosstodon.org

@AdrianVovk @chergert @swick @pwithnall local ordering per stream of calls. Generally the expectation though is that if you subscribe to sets of notifications you do this via a single call so that you get a single stream back. Hence what you want to subscribe to must be conceptually known at the moment where you implement a service, because you have to return a single stream of events with everything in it. But that should generally be fine, D-Bus kinda implies that too actually
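
The single-call subscription pattern can be sketched roughly as follows, using Varlink's `more` flag on the call and `continues` flag on the replies. This is a toy client under stated assumptions, not sd-varlink or any real library: the helper names are invented, and the method name and parameters in the usage below are hypothetical.

```python
import json
import socket


def send_msg(sock, obj):
    """Serialize one message and terminate it with the NUL delimiter."""
    sock.sendall(json.dumps(obj).encode() + b"\0")


def recv_msgs(sock):
    """Yield decoded messages from sock until the peer closes it."""
    buf = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            return
        buf += chunk
        *frames, buf = buf.split(b"\0")
        for frame in frames:
            yield json.loads(frame)


def subscribe(sock, method, parameters):
    """Issue one streaming call and yield each reply's parameters in order."""
    send_msg(sock, {"method": method, "parameters": parameters, "more": True})
    for reply in recv_msgs(sock):
        yield reply.get("parameters", {})
        if not reply.get("continues", False):
            return  # the last reply of the stream ends the call
```

Because all notifications come back on the one connection, their relative order is simply the order of the replies; for example `for event in subscribe(sock, "org.example.Events.Subscribe", {"topics": ["a", "b"]}): ...` sees events for both topics in the order the service emitted them.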

Christian Hergert

chergert@my.devsuite.app

Reply to @pid_eins@mastodon.social

@pid_eins @AdrianVovk @swick @pwithnall

That only works when you're guaranteed to have framing data _before_ your argument data (which you are not).

Lennart Poettering

pid_eins@mastodon.social

Reply to @chergert@my.devsuite.app

@chergert @AdrianVovk @swick @pwithnall so i take it you want to know the size of the next oncoming message first so that you can allocate a properly sized memory buffer for it and don't want to copy, or want to decode already while parsing? I think that's a bogus optimization attempt, because memory copying is the one thing that is ridiculously fast on modern cpus (unlike roundtrips for example which D-Bus is forcing so ridiculously many of). But more importantly what you theoretically gain..

Christian Hergert

chergert@my.devsuite.app

Reply to @chergert@my.devsuite.app

@pid_eins @AdrianVovk @swick @pwithnall

But if anyone is interested in running with it, here is where I put together a prototype but stopped after I realized I absolutely didn't want Varlink code in my applications.

It does a lot of the tricks I did to try to speed up GVariant for building typed messages.

https://gitlab.gnome.org/chergert/libgvarlink

Lennart Poettering

pid_eins@mastodon.social

Reply to @pid_eins@mastodon.social

@chergert @AdrianVovk @swick @pwithnall .. here during msg input you'd lose during message output, because you have to marshal to text first, then determine the size so that you can write it out first. It's a zero-sum game, and hence pointless...

Christian Hergert

chergert@my.devsuite.app

Reply to @pid_eins@mastodon.social

@pid_eins @AdrianVovk @swick @pwithnall

I certainly make _most_ of my apps bump at startup to the max. But there are a few cases where I can't (things that link against libimobiledevice for example).

Lennart Poettering

pid_eins@mastodon.social

Reply to @chergert@my.devsuite.app

@chergert @AdrianVovk @swick @pwithnall in systemd we vet our deps, and if we see a dep that uses select() then it's pretty obviously not up to today's standards and not acceptable to use.

I am pretty sure usage of select() should be considered a bug in that library, just like any other bug. It's 2025 ffs.

Adrian Vovk

AdrianVovk@fosstodon.org

Reply to @pid_eins@mastodon.social

@pid_eins @chergert @swick @pwithnall I think the idea is to make it possible to separate the payload from the message. Strip off the framing and then forward it elsewhere

Over in an issue Christian gave an example along the lines of:

{"method": "DoSomething", ...}\0{"arg1": "asdf", "arg2": "foobar"}\0\0
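
To make the idea concrete, here is a sketch of a dispatcher for that hypothetical split layout (header, NUL, arguments, double NUL). This is not Varlink's actual wire format, and `route` and `handlers` are names invented for illustration. Only the small header is parsed by the socket-side code; the argument bytes are handed to the handler untouched, so the cost of parsing them lands in the RPC handler.

```python
import json


def route(message: bytes, handlers):
    """Parse only the header; pass the argument bytes to the handler unparsed."""
    header, payload, terminator = message.split(b"\0", 2)
    if terminator != b"\0":
        raise ValueError("message must end with a double NUL")
    method = json.loads(header)["method"]
    return handlers[method](payload)
```

A handler registered as `handlers["DoSomething"]` receives the raw argument bytes and decides for itself when, and with what depth limits, to decode them.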

Lennart Poettering

pid_eins@mastodon.social

Reply to @AdrianVovk@fosstodon.org

@AdrianVovk @chergert @swick @pwithnall json has subobjects, it's really nicely nestable. I doubt there is too much value in being able to split things on the byte level like this. I mean usually json objects are complex, and if you forward parts of a varlink message elsewhere then it's highly unlikely you want to do that precisely at the toplevel message level rather than on some level below i.e. one of the passed args or even a part of a passed arg. Or to turn this around:

Lennart Poettering

pid_eins@mastodon.social

Reply to @pid_eins@mastodon.social

@AdrianVovk @chergert @swick @pwithnall if you want all func params as one then it's more likely you also want the func name along with it than not. Hence having a binary "cut" between method name and params seems weird and unnecessary to me.

Lennart Poettering

pid_eins@mastodon.social

Reply to @pid_eins@mastodon.social

@AdrianVovk @chergert @swick @pwithnall also having a binary cut vs a json level cut between func name and params only is relevant if you assume marshalling is where you lose performance. But that's bs. You lose performance because of roundtrips (and that's so so so so bad on dbus) and on the strict ordering/strict single threading that dbus implies. Hence: don't lose yourself too much in marshalling questions, focus on the stuff that really matters, i.e. the more general structure of the IPC.

Adrian Vovk

AdrianVovk@fosstodon.org

Reply to @pid_eins@mastodon.social

@pid_eins @chergert @swick @pwithnall Welp, easier said than done. The dependency tree of $insert GUI app here is going to be a lot more complicated than systemd's. Though it probably wouldn't hurt to figure out which of our dependencies use select and then file issues

Christian Hergert

chergert@my.devsuite.app

Reply to @AdrianVovk@fosstodon.org

@AdrianVovk @pid_eins @swick @pwithnall

The libimobiledevice one was particularly tricky because it could get pulled in from GVFS and then all the other modules (which had no idea they were "linking" against it via dlopen()) go womp womp.

Lennart Poettering

pid_eins@mastodon.social

Reply to @chergert@my.devsuite.app

@chergert @AdrianVovk @swick @pwithnall if you really really want to use such a legacy library my suggestion would be to do it out of process, so it doesn't fuck up the rest of the codebase. And of course, if it's open source just fix the damn library. And if it's not open source, then it should be out of process anyway...

Christian Hergert

chergert@my.devsuite.app

Reply to @pid_eins@mastodon.social

@pid_eins @AdrianVovk @swick @pwithnall

Yeah I agree, but there are libraries we still have out there that break it. And it's obnoxious now that we could be using 240 dma-buf / sec + the grip of them used in Mesa vulkan caches. We regularly bump up against 1024 in GTK apps now.

Lennart Poettering

pid_eins@mastodon.social

Reply to @chergert@my.devsuite.app

@chergert @AdrianVovk @swick @pwithnall any non-trivial program should bump soft RLIMIT_NOFILE to hard RLIMIT_NOFILE during early initialization and never look back.

(GLib should probably help with that and in particular auto-reset the soft limit for spawned processes back to 1024 for compat/safety.)
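
A rough sketch of that recipe on Linux, using Python's standard `resource` module: raise the soft limit to the hard limit at startup, and reset the soft limit for children that might still use select(). The function names and the `soft=1024` default are illustrative assumptions, and `preexec_fn` is used only for brevity (it is not safe in multi-threaded programs).

```python
import resource
import subprocess


def raise_nofile_limit():
    """Raise the soft RLIMIT_NOFILE to the hard limit; return the original pair."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
    return soft, hard


def spawn_with_soft_limit(argv, soft=1024):
    """Run a child whose soft limit is reset for select()-using programs."""
    def reset():
        # Runs in the child between fork() and exec().
        _, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
        limit = soft if hard == resource.RLIM_INFINITY else min(soft, hard)
        resource.setrlimit(resource.RLIMIT_NOFILE, (limit, hard))
    return subprocess.run(argv, preexec_fn=reset, check=True, capture_output=True)
```

Resetting only the soft limit leaves the hard limit intact, so a child that knows how to handle many descriptors can still raise its own soft limit again.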

Lennart Poettering

pid_eins@mastodon.social

Reply to @pid_eins@mastodon.social

@chergert @AdrianVovk @swick @pwithnall i am kinda tempted to add a thing to systemd where via bpf we automatically complain about use of legacy apis such as select() in the logs. As an attempt to push people to update their code...

Christian Hergert

chergert@my.devsuite.app

Reply to @pid_eins@mastodon.social

@pid_eins @AdrianVovk @swick @pwithnall

Let me be clear, I don't like D-Bus either.

But I don't want to go against the grain unless I can be certain that alternative direction is going to supplant the status-quo.

And this all feels very unfinished from the application API standpoint.

We've worked so hard in GTK 4 to focus on API ergonomics and work backwards to the technology. This feels quite the opposite.

Lennart Poettering

pid_eins@mastodon.social

Reply to @chergert@my.devsuite.app

@chergert @AdrianVovk @swick @pwithnall my experience is quite the opposite. After having written both sd-bus and sd-varlink and very complex apps using both, and knowing both protocols quite well, i am very strongly of the opinion that writing varlink is so so so so much nicer and quicker and simpler. The fact that the number of varlink apis in systemd almost exploded in a very short time while our dbus apis kinda stagnate in their comprehensiveness is testament to that.

Lennart Poettering

pid_eins@mastodon.social

Reply to @pid_eins@mastodon.social

@chergert @AdrianVovk @swick @pwithnall for the first time it feels like adding an ipc api to some low level os functionality is a 1h job from writing, over testing, to submitting, while for dbus it was more of a 1 *day* job if you were lucky, and more if not.

Christian Hergert

chergert@my.devsuite.app

Reply to @pid_eins@mastodon.social

@pid_eins @AdrianVovk @swick @pwithnall

What are the plans for observability? Do we need to create eBPF programs to snoop? If so, can we get any attributes on sockets intended for varlink?

I ask with both Sysprof and D-Spy hats on.

(Clearly we can't interject `tee` in the middle system-wide).

Lennart Poettering

pid_eins@mastodon.social

Reply to @chergert@my.devsuite.app

Edited 3 months ago

@chergert @AdrianVovk @swick @pwithnall right now i trace varlink stuff mostly via simple strace. It's quite sufficient to me, since the messages are immediately readable, and transactions directly recognizable by their fd and isolated on syscall param level already. Hence so far I haven't needed more complex tracers as would be appropriate for a datagram based multipeer protocol such as dbus where messages must be decoded before being readable and put into transactional context explicitly.

Lennart Poettering

pid_eins@mastodon.social

Reply to @pid_eins@mastodon.social

@chergert @AdrianVovk @swick @pwithnall but yes, if we want varlink support for those kind of tracers we'd need to make varlink sockets recognizable somehow (xattrs would be perfect for that but right now kernel explicitly refuses xattrs on S_IFSOCK inodes), and then pick them up via a simple bpf prog.

Pavel Machek

pavel

Reply to @pid_eins@mastodon.social

@pid_eins @chergert @AdrianVovk @swick @pwithnall That would also warn about portable applications, so maybe don't...

Lennart Poettering

pid_eins@mastodon.social

Reply to @pavel

@pavel @AdrianVovk @swick @pwithnall @chergert poll() has been pretty universal for a long time. It's in posix and all the bsds have it, and so does macosx since 2005. I think coverage is pretty much complete. If you run archaic software, sure you'd get these logs, but they'd just be logs, so can be ignored...
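
For illustration, a minimal poll()-based wait, sketched with Python's standard `select.poll` wrapper; `wait_readable` is a made-up helper name. Unlike select(), whose fd_set is capped at FD_SETSIZE (1024 with glibc), poll() takes a list of descriptors and so works no matter how large the descriptor numbers get.

```python
import select
import socket


def wait_readable(socks, timeout_ms=1000):
    """Return the sockets that are readable, using poll() rather than select()."""
    poller = select.poll()
    by_fd = {s.fileno(): s for s in socks}
    for fd in by_fd:
        poller.register(fd, select.POLLIN)
    return [by_fd[fd] for fd, events in poller.poll(timeout_ms) if events & select.POLLIN]
```

Code written this way keeps working after the soft RLIMIT_NOFILE has been raised, which is the compatibility problem select() users run into.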
