
Asahi Lina (朝日リナ) // nullptr::live

Edited 3 months ago

@corbet

"König is correct, in that these problems do not (hopefully) happen with existing, in-tree graphics drivers."

No, he isn't. I think you missed the email where I pointed out how the codepaths in the first in-tree C driver I looked at (panfrost) suffered from the same exact bugs. The only reason this isn't actively blowing up on the daily for users is that in-tree drivers mostly use a global scheduler and only tear it down on device removal, so any potential problems can only happen when you unbind/unplug the GPU, which is only relevant in practice for eGPU users outside of testing scenarios.

In other words, the existing C code is just as broken as Rust would be without the fixups I sent; it just happens to hit the breakage less often due to other driver design differences. The problem is not just that the interface cannot reasonably be used safely from Rust, it's that it can't reasonably be used safely from C either, since users don't understand and don't uphold the undocumented requirements for safe use. C developers just get away with being blissfully unaware of the mistakes until the code actually crashes, because the language doesn't force them to consider these things upfront as part of its syntax.

(Apparently I can't comment without an LWN subscription... Edit: Managed to get one so I commented directly on the article.)


Asahi Lina (朝日リナ) // nullptr::live

@corbet

Never mind that there are actual, serious memory safety bugs in drm_sched that affect all drivers, since they have nothing to do with the usage pattern of the driver. I tracked one down recently, but it's just more evidence that the design of this thing is poor, not just in terms of API but also in its internal architecture: even the maintainers themselves can't maintain code quality and stop memory safety bugs from creeping into drm_sched, because it's so difficult to understand its architecture in practice.

As for Nova, there is zero chance of that being upstreamed before drm/asahi. They are literally using my abstractions and depend on more abstraction work (all of KMS), plus the core of the driver doesn't even exist yet. drm/asahi is essentially ready to merge as soon as platform driver abstractions exist and I pick up some bits and pieces from DRM again. I'm just waiting for all the core driver/device framework stuff to land first because that is taking way longer than I expected, and I'm too burned out by random kernel maintainers to take on that work myself (reading some of the upstreaming threads for related stuff is quite honestly painful, I'm glad some other people are taking on that work now). Some of Nova may make it upstream first, but I guarantee it won't be a functional driver before drm/asahi is merged as a whole, unless something goes horribly wrong.

@lina As far as I can tell, you do have an LWN subscription. I definitely encourage you to add your point of view there.

This was one of those articles where the best I can hope for is that everybody is equally mad at me... can I go write about memory tiering now?

Asahi Lina (朝日リナ) // nullptr::live

@corbet I just got one and commented there a few minutes ago.
