Conversation

The recent bugs-in-coreutils-in-rust kerfuffle has me thinking again about Future Shell Tools.

Ideally it would involve support for structured data. Probably json since it's so ubiquitous, but importantly negotiable to allow for other formats, including interop with classic plain-text tooling.

Also such structured data should include passing of file descriptors for more flexibility and avoidance of textual path reparsing footguns.

This is more interesting to me than rewriting "mv" and "cp".

@swetland

Are you inventing powershell from first principles here? šŸ˜…

@swetland @whitequark brb let me know when you've reinvented powershell XD

@malwareminigun @swetland powershell but it's not made by crackheads who think cat shouldn't work on binary files would be my fave shell

@rlonstein @swetland I was gonna raise PowerShell as another example. Also xonsh is cool -- just make Python nice to use as a shell, now everything can be Python objects if you want.

@wren6991 @rlonstein I mean apart from Windows and Python being two things that are the exact opposite of "sparks joy", and uncertainty about how "interop with existing tooling" works, okay.

@swetland Nushell has been my daily driver at $WORK for a few years, interop is not a problem.

@rlonstein Is there documentation that describes how the management and conveyance of type information is handled "on the wire" as it were?

I've skimmed the docs a bit and so far have not stumbled over any specifics on how this is implemented... how programs that want to emit or receive structured data indicate to nushell and/or other entities in the pipeline what format it is in, etc.

@swetland I want us to move away from delimiter-based formats and offer CBOR instead, even though it's binary

@chrisvest Yeah I'd absolutely prefer some kind of lightweight binary wire format rather than textual (JSON's constraint to JavaScript's concept of numeric types feels clunky to me at the best of times), though being able to support both that and a human-readable representation (at least at the tooling level) seems like an important feature to have.

Also I foolishly would like a format that's simple and inexpensive to parse or generate (JSON's at least not XML, but still could be better there).
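To make the delimiter-vs-binary point concrete, here's a minimal sketch (Python for brevity, purely illustrative, not any particular wire format) of length-prefixed framing, which lets payloads contain newlines and NULs that would break any delimiter-based text protocol:

```python
import io
import struct

# Minimal binary framing: each message is a 4-byte big-endian length
# prefix followed by raw payload bytes. Unlike newline-delimited text,
# the payload may contain absolutely any bytes without escaping.

def write_frame(stream, payload: bytes) -> None:
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)

def read_frame(stream) -> bytes:
    header = stream.read(4)
    if len(header) < 4:
        raise EOFError("truncated frame header")
    (length,) = struct.unpack(">I", header)
    return stream.read(length)

# Round-trip a payload that would break a newline-delimited protocol.
buf = io.BytesIO()
nasty = b'{"name": "line1\nline2"}\x00\xff'
write_frame(buf, nasty)
buf.seek(0)
assert read_frame(buf) == nasty
```

The framing itself is also cheap to generate and parse, which touches the "simple and inexpensive" wish above.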

@swetland sounds like nushell

@swetland

Somewhere, I have a small patch set for FreeBSD that adds two features.

The first is a content-negotiation protocol for pipes. This follows the model used for drag and drop and is backwards compatible if one end doesn’t support it. The sender does an ioctl to advertise a list of types that it supports; if the receiver instead does a plain read or a poll/select/kqueue wait, the sender gets an error indicating that the receiver does not support negotiation. The receiver does an ioctl to retrieve the advertised list and another to specify the desired type; that first ioctl similarly returns an error if the sender has simply written data to the pipe (i.e. the sender doesn’t support content negotiation). Once a type is agreed, the sender writes data in that format.
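The state machine this describes can be modelled in userspace; here is a toy sketch (Python, purely illustrative, not the FreeBSD patch itself) with ioctl errors modelled as exceptions:

```python
# Toy model of the negotiated-pipe state machine: a plain read() or a
# plain write() before negotiating marks that end as legacy, and the
# other end's negotiation "ioctl" then fails with an error.

class NegotiationError(Exception):
    pass

class NegotiatedPipe:
    def __init__(self):
        self.offered = None       # types advertised by the sender
        self.chosen = None        # type selected by the receiver
        self.legacy_recv = False  # receiver did a plain read first
        self.buffer = b""

    def offer(self, types):
        """Sender side: advertise supported types (the sender ioctl)."""
        if self.legacy_recv:
            raise NegotiationError("receiver does not negotiate")
        self.offered = list(types)

    def fetch_offer(self):
        """Receiver side: retrieve the offer (the first receiver ioctl)."""
        if self.buffer and self.offered is None:
            raise NegotiationError("sender does not negotiate")
        return self.offered

    def choose(self, t):
        """Receiver side: pick a type (the second receiver ioctl)."""
        if t not in (self.offered or []):
            raise NegotiationError("type not offered")
        self.chosen = t

    def write(self, data: bytes):
        self.buffer += data

    def read(self):
        if self.chosen is None:
            self.legacy_recv = True  # plain read marks a legacy receiver
        data, self.buffer = self.buffer, b""
        return data

# Modern sender and modern receiver agree on a type:
p = NegotiatedPipe()
p.offer(["application/cbor", "text/plain"])
p.choose(p.fetch_offer()[0])
p.write(b"\xa1\x61k\x61v")  # CBOR encoding of {"k": "v"}
assert p.chosen == "application/cbor"
assert p.read() == b"\xa1\x61k\x61v"
```

The backwards-compatibility property falls out of the two legacy checks: a pair of unmodified tools never touches the negotiation calls and just sees an ordinary pipe.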

The second was ā€˜pipe peeling’ in the TTY layer. This let you get a second data channel to the terminal emulator (or whatever owns the server side of the tty), so you could send text for display on the normal terminal but also have a completely different stream (or more than one) for other data. You could use this to provide structured data for other rendering, accessibility data, and so on.

I also modified libxo, which a bunch of FreeBSD tools use to provide structured output, so that it would use both of these. If the standard output was a pipe, it would try content negotiation. If the standard output was a tty, it would try to peel off a pipe and send structured output there as well as normal output to the tty.

I had a little demo that implemented a tiny tty server that mostly forwarded to the host terminal but accepted peeled pipes and asked for HTML that it would then send to a web browser to open, so anything that used libxo would automatically display pretty (sortable, filterable) output as well as the terminal output.

I had planned to add support for the additional channels in SSH (SSH supports additional data channels, they just needed wiring up), but I never got around to it.

There were some cleanups to do, but it gave you a very simple model for providing rich content between command-line tools (and between tools and the terminal emulator).

The entire diff was only around a hundred lines of code, so implementing the same thing in other operating systems wouldn’t be very hard.

@david_chisnall @swetland structured object exchange between processes is the one feature I envy PowerShell users for ... you can certainly get a long way with one-JSON-object-per-line in bash but having to tiptoe around the horror that readline might do to your JSON blob and then deserialise it in bash is an utter pain ... unless bash now has a JSON parser grafted on somewhere I wasn't aware of ...
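For contrast, the one-JSON-object-per-line pattern is painless as soon as the consumer is anything with a real JSON parser; the fragility is specific to doing it in bash. A small illustrative sketch (hypothetical pipeline stage, made-up field names):

```python
import io
import json

def filter_errors(stream):
    """Yield NDJSON records whose 'level' field is 'error'."""
    for line in stream:
        record = json.loads(line)  # one JSON object per line
        if record.get("level") == "error":
            yield record

# Simulated upstream pipe carrying newline-delimited JSON:
upstream = io.StringIO(
    '{"level": "info", "msg": "started"}\n'
    '{"level": "error", "msg": "disk full"}\n'
)
assert [r["msg"] for r in filter_errors(upstream)] == ["disk full"]
```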

@mherbert @swetland

PowerShell has a bit of a cliff here. PowerShell cmdlets can exchange rich data because they're running in the same .NET VM. And that's not really different from different shell scripts that you source, except that shell scripts are a strong contender for the worst programming language ever designed (and I use the term in the loosest possible sense). Using something like Lua as your repl would have the same effect.

PowerShell's communication with external commands remains primitive. It defines a serialisation format for the data parts of .NET objects but there's still no negotiation flow: if something knows it's invoked by PowerShell, it can use that format to communicate with cmdlets.

The things I added to the pipe and tty layers would work with PowerShell as well: commands could peel a pipe from the tty layer and establish a communication channel with the shell for communicating CliXml objects (in both directions).

@david_chisnall I really like that simple approach to content negotiation. Seems very straightforward and doesn't require the entity setting up the pipe/socket to do anything special... so a "classic" shell can hook up two "modern" tools without needing any modification.

@david_chisnall @swetland i had a very similar experiment, but for the sake of expediency i initially added a "fourth standard fd" for the negotiation channel, the semantics otherwise sound quite similar. my demos mostly revolved around being able to easily turn random unstructured tables and flows into graphs, which is still far too annoying in pipelines today
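A quick sketch of that extra-fd arrangement using Python's `subprocess` with `pass_fds`; here the fd number travels in argv for simplicity, where a real "fourth standard fd" convention would dup2() it onto a fixed descriptor such as 3:

```python
import os
import subprocess
import sys

# The parent hands the child one extra pipe alongside
# stdin/stdout/stderr and uses it as a negotiation side channel,
# leaving fds 0-2 untouched for ordinary plain-text I/O.

r, w = os.pipe()
child_src = (
    "import os, sys; "
    "os.write(int(sys.argv[1]), b'offer: application/json')"
)
# pass_fds keeps w open in the child under the same fd number.
subprocess.run([sys.executable, "-c", child_src, str(w)], pass_fds=(w,))
os.close(w)
assert os.read(r, 64) == b"offer: application/json"
os.close(r)
```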

@david_chisnall @swetland I tried to live with nushell for a while to get some of this in a mature setup, and the structured stuff it has can be nice. In the end I found it just isn't enough of a _shell_ for me. I wouldn't necessarily have predicted that I'd never give up expansions before giving nushell a real try, but the constant reordering of things for interpolation instead was something I just never got over. nushell is also massive because it carries the kitchen sink inside the shell itself

@raggi @david_chisnall @swetland yeah the massive binary and dependency tree put me off. it feels like giving up something important

@whitequark @david_chisnall @swetland it absolutely is yeah, there are other problems with nushell too, in many ways they have a similar sniff to the uutils problems in that they're mostly "how someone who doesn't know yet thinks the shell works" kinds of issues - their windows port had the same too.

i don't really fault them for being different per se, but similar to the earlier comment i think the more interesting stuff comes when you go even more different

@whitequark @david_chisnall @swetland for example if i picked this line of stuff up again, i'm fairly sure i'd ditch the tty in the main interface. i want a UX that _feels similar_ to using a shell, but I want addressable, movable, gui-interactible outputs, i want incremental command refinement in a modern ux, i want richer input and output modes, etc. the part i want to retain is composition and you do need some common protocol for that - but that protocol needn't define the UX

@whitequark @raggi @david_chisnall David's approach is extremely appealing because it allows for support to be incrementally added to things rather than requiring an all-or-nothing shift to get the fancier behaviors.

It does have the friction of needing a kernel patch... so even if various kernel projects are willing to pick it up without a lot of deliberation and debate it'd take a while to percolate, but then again the magic of open source is you can actually have experimental kernel patches.


@raggi @david_chisnall @swetland

Not much interest in trying, but if I tried anything I'd definitely see more point in trying nu. With fish, the reality is that it probably has some cool things over zsh, but I don't believe they would cover the "incompatibility price tag" :-) nu could probably also be run occasionally without being too confusing to the brain, exactly because it is so different.

@swetland @raggi @david_chisnall you could probably back-fill the ioctl using some sort of lightweight syscall emulation, or by abstracting it into a library and using a daemon as a stopgap

@swetland @whitequark @david_chisnall i agree. another approach i explored in my head a bit more recently (not tried any of this) was to have the shell abuse metadata in the programs (elf notes or similar) and then do the negotiation and leave an answer in env or similar.
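A minimal sketch of that handshake-via-environment idea: the "shell" decides a content type out of band (in the real idea, from ELF notes) and leaves the answer in the environment; the child emits structured or plain output depending on what it finds. The variable name NEGOTIATED_TYPE is invented for this sketch:

```python
import json
import os
import subprocess
import sys

# Hypothetical child program: checks the negotiated type left in its
# environment and picks an output format accordingly.
child_src = r"""
import json, os
if os.environ.get("NEGOTIATED_TYPE") == "application/json":
    print(json.dumps({"cpu": 12.5}))
else:
    print("cpu: 12.5%")
"""

# The shell negotiated JSON, so the child emits JSON:
env = dict(os.environ, NEGOTIATED_TYPE="application/json")
out = subprocess.run([sys.executable, "-c", child_src],
                     env=env, capture_output=True, text=True).stdout
assert json.loads(out) == {"cpu": 12.5}

# No negotiation answer in the environment: plain-text fallback.
plain_env = {k: v for k, v in os.environ.items() if k != "NEGOTIATED_TYPE"}
out = subprocess.run([sys.executable, "-c", child_src],
                     env=plain_env, capture_output=True, text=True).stdout
assert out.strip() == "cpu: 12.5%"
```

The appeal is that it needs no kernel support at all, at the cost of the shell having to understand the scheme.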

@whitequark @raggi @david_chisnall I think you could actually do a version of this as a driver rather than a subsystem change (so it'd be really easy to build just a single little module rather than having to do a whole kernel rebuild). Basically have the driver just have a couple ioctls where you use the driver fd to invoke the content negotiation actions on a pipe fd you pass in, just one level of indirection.

@raggi @whitequark @david_chisnall I think not having the shell (or whatever entity sets up the pipeline) be aware in any way is a *major* advantage of David's approach, so I'd want to keep that.

Using a little driver or a service could definitely be workable.

Does linux have a mechanism to get some kind of unique entity ID from an fd these days? If so, with the service/daemon approach you could use the pipe fd as the key/token for the actions easily enough.
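On Linux the answer is effectively yes: both ends of a pipe share an inode on the kernel's internal pipefs, so fstat() already yields a usable identity token. A small demonstration:

```python
import os

# (st_dev, st_ino) from fstat() identifies the underlying pipe object,
# so a daemon handed fds from both ends can tell they are the same pipe.

def pipe_identity(fd: int):
    st = os.fstat(fd)
    return (st.st_dev, st.st_ino)

r1, w1 = os.pipe()
r2, w2 = os.pipe()
assert pipe_identity(r1) == pipe_identity(w1)  # two ends, same pipe
assert pipe_identity(r1) != pipe_identity(r2)  # different pipes differ
os.close(r1); os.close(w1); os.close(r2); os.close(w2)
```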

@swetland @whitequark @david_chisnall a hacky way to start would be to use sockets rather than pipes which have lots of side channel gunk already

@whitequark @raggi @david_chisnall On the userspace side code could try to open /dev/content-negotiation, try the direct ioctls (once they exist), and fall back to good ol' plaintext if neither is available.
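That probing order could look something like the following sketch; both the device path (from the post) and the ioctl request number are hypothetical, so on a stock kernel every probe fails and the function lands on the plain-text fallback:

```python
import fcntl
import os

# Made-up ioctl request number for the hypothetical negotiation ioctl.
CONTENT_NEGOTIATION_IOCTL = 0x4E45474F

def choose_transport(fd: int) -> str:
    # 1. Try the hypothetical helper device.
    try:
        dev = os.open("/dev/content-negotiation", os.O_RDWR)
        os.close(dev)
        return "helper-device"
    except OSError:
        pass
    # 2. Try the hypothetical direct ioctl on the pipe itself.
    try:
        fcntl.ioctl(fd, CONTENT_NEGOTIATION_IOCTL, b"\0" * 8)
        return "direct-ioctl"
    except OSError:  # ENOTTY on an unpatched kernel
        pass
    # 3. Neither exists: plain text it is.
    return "plaintext"

r, w = os.pipe()
result = choose_transport(w)
# On a stock kernel every probe fails, so this is "plaintext".
assert result == "plaintext"
os.close(r); os.close(w)
```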

@raggi @swetland @david_chisnall you can use socketpair() with SCM_CREDENTIALS i think, which gives you a way to transmit an unforgeable (without CAP_SYS_ADMIN) pid
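A Linux-only sketch of that mechanism: with SO_PASSCRED enabled on the receiving end of a socketpair, the kernel attaches a struct ucred (pid, uid, gid) to each message, which the peer cannot forge without privilege:

```python
import os
import socket
import struct

# struct ucred on Linux is three native ints: pid, uid, gid.
UCRED_FMT = "3i"

parent, child = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
# Ask the kernel to attach sender credentials to received messages.
parent.setsockopt(socket.SOL_SOCKET, socket.SO_PASSCRED, 1)

child.send(b"hello")
msg, ancdata, flags, addr = parent.recvmsg(
    1024, socket.CMSG_SPACE(struct.calcsize(UCRED_FMT)))

assert ancdata, "expected SCM_CREDENTIALS ancillary data"
for level, ctype, data in ancdata:
    if level == socket.SOL_SOCKET and ctype == socket.SCM_CREDENTIALS:
        pid, uid, gid = struct.unpack(UCRED_FMT, data)
        # Both ends live in this process, so the pid is our own.
        assert pid == os.getpid()
        assert uid == os.getuid()

parent.close(); child.close()
```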

@whitequark @raggi @david_chisnall I was thinking of the fd entity ... so if two sides of a pipe or socket send a request to a daemon, the daemon can recognize that the fds it is passed are pointing to the same pipe or socket.

Though I think overall doing it as a (loadable) driver would be an easier way to provide an equivalent (minus the indirection) interface.

@swetland @whitequark @raggi

The down side of that approach is that it interacts poorly with sandboxing. Now every process that you might invoke interactively needs permission to access a new path.

@raggi @swetland @whitequark

The .NET CLI folks had a nice prototype while I was at MS. I’m not sure if it was ever published, but they had an extra section in the binary that contained a state machine (might have been CLR bytecode?) for command-line autocompletion.

PowerShell could pull this out and load it and then effectively get a cmdlet that wrapped the program. I think the goal was to make it possible to describe rich parameter types, so you’d be able to do things like use PowerShell variables containing structured data as command-line arguments and have them be type checked in the shell.

@david_chisnall @swetland oh, that is the proper version of some stuff I've been thinking of for a while.

Namely, a terminal that would also support programs that produced HTML output *or* that serve a web application. (So for example, an HTML top variant that serves a real-time web UI.)

(Likely then you could also extend that terminal to have browser functionalities, so "open https://example.com" would turn a tab into a browser.)

@coder @swetland

The kernel bit didn’t care what the types were. I initially used MIME types, but it would also be fine to say ā€˜this pipe is actually an HTTP interface now’ and provide a web interface.

I’d love to have that with the SSH tunneling so I could connect to a remote machine and then have web interfaces for administrative tools on a local web browser, without any faffing with port forwarding.

@david_chisnall @coder @swetland that made me recall this https://lobste.rs/s/mtatsi/unix_shell_programming_next_50_years#c_p4ojir

After closing the whole https://arcan-fe.com/2025/01/27/sunsetting-cursed-terminal-emulation/ saga and getting more hands on experience living mostly terminal-free since then, a recurring gripe is the dynamic side to "typed pipes".

UX for type-negotiation in clipboard and drag and drop is not good in general. Some of that is the WM designer's fault ('remember to press shift when pasting to exchange text/plain instead of application/rtf', or whatever). Turning 'a | b | c' in the shell ecosystem into runtime type negotiation repeats the issue.

The CLI evaluator can't know or specify which of the intersecting sets of types is desired, and can't inform the user about the possibilities or about a zero-match. You get cornered into probing dynamically, or into side-band databases again.

What I am curious about, and won't get around to since I'm mostly burnt out on the whole shebang, is how another ELF section describing imports/exports, plus a loader-provided typeof(stdin) | typeof(stdout), would fare towards an 'a (html) | b (json) | c' with feedback without execution.
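A toy model of that static check (Python; the declarations would really come from an ELF section, here they're a hard-coded dict with made-up entries for commands a, b, c):

```python
# Each command declares typeof(stdin) and typeof(stdout); None means
# "no structured input/output". The evaluator can then reject a
# pipeline, with feedback, before executing anything.
DECLS = {
    "a": {"stdin": None, "stdout": "text/html"},
    "b": {"stdin": "application/json", "stdout": "application/json"},
    "c": {"stdin": "application/json", "stdout": None},
}

def check_pipeline(commands):
    """Return one mismatch message per incompatible '|' in the pipeline."""
    problems = []
    for left, right in zip(commands, commands[1:]):
        out_t = DECLS[left]["stdout"]
        in_t = DECLS[right]["stdin"]
        if out_t != in_t:
            problems.append(f"{left} emits {out_t}, {right} wants {in_t}")
    return problems

# 'a | b | c' fails up front, with an explanation:
assert check_pipeline(["a", "b", "c"]) == [
    "a emits text/html, b wants application/json"
]
# 'b | c' type-checks:
assert check_pipeline(["b", "c"]) == []
```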
