
Jarkko Sakkinen

I made lsiommu as I just wanted to get rid of the shitty combination of bash and python I had before:

~/work/staging/lsiommu master*
❯ build/lsiommu
IOMMU Group 0
	00:07.1 Class 060400: Vendor 8086 Device 9a25 [8086:9a25] (rev 01)
IOMMU Group 1
	00:07.0 Class 060400: Vendor 8086 Device 9a23 [8086:9a23] (rev 01)
IOMMU Group 2
	00:02.0 Class 030000: Vendor 8086 Device 9a49 [8086:9a49] (rev 01)
IOMMU Group 3
	00:00.0 Class 060000: Vendor 8086 Device 9a14 [8086:9a14] (rev 01)
IOMMU Group 4
	00:04.0 Class 118000: Vendor 8086 Device 9a03 [8086:9a03] (rev 01)
IOMMU Group 5
	00:0a.0 Class 118000: Vendor 8086 Device 9a0d [8086:9a0d] (rev 01)
IOMMU Group 6
	00:0d.0 Class 0c0330: Vendor 8086 Device 9a13 [8086:9a13] (rev 01)
	00:0d.2 Class 0c0340: Vendor 8086 Device 9a1b [8086:9a1b] (rev 01)
IOMMU Group 7
	00:0e.0 Class 010400: Vendor 8086 Device 9a0b [8086:9a0b] (rev 00)
IOMMU Group 8
	00:14.0 Class 0c0330: Vendor 8086 Device a0ed [8086:a0ed] (rev 20)
	00:14.2 Class 050000: Vendor 8086 Device a0ef [8086:a0ef] (rev 20)
IOMMU Group 9
	00:14.3 Class 028000: Vendor 8086 Device a0f0 [8086:a0f0] (rev 20)
IOMMU Group 10
	00:15.0 Class 0c8000: Vendor 8086 Device a0e8 [8086:a0e8] (rev 20)
IOMMU Group 11
	00:16.0 Class 078000: Vendor 8086 Device a0e0 [8086:a0e0] (rev 20)
IOMMU Group 12
	00:1d.0 Class 060400: Vendor 8086 Device a0b0 [8086:a0b0] (rev 20)
IOMMU Group 13
	00:1f.0 Class 060100: Vendor 8086 Device a082 [8086:a082] (rev 20)
	00:1f.3 Class 040100: Vendor 8086 Device a0c8 [8086:a0c8] (rev 20)
	00:1f.4 Class 0c0500: Vendor 8086 Device a0a3 [8086:a0a3] (rev 20)
	00:1f.5 Class 0c8000: Vendor 8086 Device a0a4 [8086:a0a4] (rev 20)
IOMMU Group 14
	55:00.0 Class 010802: Vendor 144d Device a808 [144d:a808] (rev 00)

Perhaps the most interesting implementation note is that it uses libudev for PCI discovery, instead of traversing sysfs (because the latter sucks).

Right, and I made my own shitty teardown manager framework:

/* SPDX-License-Identifier: GPL-3.0-or-later */
/*
 * Copyright(c) Opinsys Oy 2025
 */

#ifndef TEARDOWN_H
#define TEARDOWN_H

#include <libudev.h>

#define teardown(func) __attribute__((cleanup(func)))

void teardown_udev(struct udev **udev);
void teardown_udev_device(struct udev_device **dev);
void teardown_udev_enumerate(struct udev_enumerate **enumerate);

#endif /* TEARDOWN_H */
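
The matching teardown.c is not shown in the post, so here is only a minimal sketch of how a `cleanup`-based teardown function behaves. To keep it buildable without libudev it uses a plain heap buffer: `teardown_buffer`, `teardown_calls` and `format_group` are hypothetical names, not part of lsiommu. The libudev variants declared above would presumably do the same with `udev_unref()`, `udev_device_unref()` and `udev_enumerate_unref()`.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Same macro as in teardown.h (a GCC/Clang extension, not standard C). */
#define teardown(func) __attribute__((cleanup(func)))

/* Hypothetical counter, only here to make the teardown calls observable. */
int teardown_calls;

/* Hypothetical teardown: receives a pointer to the annotated variable. */
void teardown_buffer(char **buf)
{
	free(*buf);		/* free(NULL) is a no-op */
	*buf = NULL;
	teardown_calls++;
}

/* Returns 0 on success, -1 on failure; buf is released on every path. */
int format_group(unsigned int group, char *out, size_t out_len)
{
	teardown(teardown_buffer) char *buf = malloc(32);

	if (!buf)
		return -1;

	snprintf(buf, 32, "Group %03u", group);

	if (strlen(buf) >= out_len)
		return -1;	/* early return: teardown_buffer() still runs */

	memcpy(out, buf, strlen(buf) + 1);
	return 0;
}
```

The point of the pattern is that error paths need no `goto out` labels: the compiler calls the teardown function whenever the variable goes out of scope.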

Dependencies:

❯ ldd build/lsiommu 
	linux-vdso.so.1 (0x00007f083ccd5000)
	libargtable2.so.0 => /lib/x86_64-linux-gnu/libargtable2.so.0 (0x00007f083cc8a000)
	libudev.so.1 => /lib/x86_64-linux-gnu/libudev.so.1 (0x00007f083cc5c000)
	libsystemd.so.0 => /lib/x86_64-linux-gnu/libsystemd.so.0 (0x00007f083cb8c000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f083c9ab000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f083ccd7000)
	libcap.so.2 => /lib/x86_64-linux-gnu/libcap.so.2 (0x00007f083c99f000)
	libgcrypt.so.20 => /lib/x86_64-linux-gnu/libgcrypt.so.20 (0x00007f083c856000)
	liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f083c827000)
	libzstd.so.1 => /lib/x86_64-linux-gnu/libzstd.so.1 (0x00007f083c76b000)
	liblz4.so.1 => /lib/x86_64-linux-gnu/liblz4.so.1 (0x00007f083c745000)
	libgpg-error.so.0 => /lib/x86_64-linux-gnu/libgpg-error.so.0 (0x00007f083c71d000)

I can throw this to some Git repository if anyone is interested in any of this. It’s really just “by me for me”, but I don't mind sharing it either.

#linux #kernel #iommu

Deps were "bookworm optimized", i.e. I used that Debian version as the lowest common denominator for the deps.

#debian

Jarkko Sakkinen

❯ wc -l *.c *.h
  317 iommu.c
   31 log.c
   70 main.c
   32 teardown.c
   26 iommu.h
   16 log.h
   11 main.h
   20 teardown.h
  523 total

Not too bad considering that iommu.c has heap tree and radix sort implementations (I dislike qsort for anything, really).
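
The iommu.c source is not included in the thread, so the following is only a sketch of what a radix sort over IOMMU group numbers could look like, not the author's implementation. `struct entry` and `radix_sort_entries` are hypothetical names. It is a stable LSD radix sort, one byte per pass, so devices that share a group keep their relative order.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record: group number plus PCI address. */
struct entry {
	uint32_t group;
	char addr[8];		/* e.g. "00:1f.3" plus NUL */
};

/* Stable LSD radix sort by group. Returns 0, or -1 if malloc() fails. */
int radix_sort_entries(struct entry *entries, size_t n)
{
	struct entry *tmp, *src, *dst, *swap;

	if (n < 2)
		return 0;

	tmp = malloc(n * sizeof(*tmp));
	if (!tmp)
		return -1;

	src = entries;
	dst = tmp;

	for (unsigned int shift = 0; shift < 32; shift += 8) {
		size_t count[257] = { 0 };

		/* Histogram of the current byte, offset by one slot. */
		for (size_t i = 0; i < n; i++)
			count[((src[i].group >> shift) & 0xff) + 1]++;

		/* Prefix sums: count[b] becomes the first index for byte b. */
		for (size_t b = 0; b < 256; b++)
			count[b + 1] += count[b];

		/* Scatter in input order, which keeps the sort stable. */
		for (size_t i = 0; i < n; i++)
			dst[count[(src[i].group >> shift) & 0xff]++] = src[i];

		swap = src;
		src = dst;
		dst = swap;
	}

	/* Four passes mean an even number of swaps: result is in entries. */
	free(tmp);
	return 0;
}
```

Unlike `qsort`, this needs no comparison callback and runs in a fixed four passes over the array, at the cost of one temporary buffer of the same size.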

RE: https://social.kernel.org/objects/96e13d6c-6be2-4180-9bbc-f4e3fbd6a38b


first valgrind pass i tried also looked pretty decent:

==425020==
==425020== HEAP SUMMARY:
==425020==     in use at exit: 0 bytes in 0 blocks
==425020==   total heap usage: 2,143 allocs, 2,143 frees, 786,679 bytes allocated
==425020==
==425020== All heap blocks were freed -- no leaks are possible
==425020==
==425020== For lists of detected and suppressed errors, rerun with: -s
==425020== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

was a bit of a surprise tho :-)


@jarkko
This is the weirdest by-catch I've got from following the hashtag! Does it show which memory areas are accessible to which device? Can it be used to find anomalies and incorrect configurations? I'd love to inspect the state of my IOMMU!

@ge0rg in a technical sense you can already do what my tool does with lspci -vmm, i.e. get IOMMU groups:

❯ lspci -vmm|head -25
Slot: 00:00.0
Class: Host bridge
Vendor: Intel Corporation
Device: 11th Gen Core Processor Host Bridge/DRAM Registers
SVendor: Hewlett-Packard Company
SDevice: 11th Gen Core Processor Host Bridge/DRAM Registers
Rev: 01
ProgIf: 00
IOMMUGroup: 3

Slot: 00:02.0
Class: VGA compatible controller
Vendor: Intel Corporation
Device: TigerLake-LP GT2 [Iris Xe Graphics]
SVendor: Hewlett-Packard Company
SDevice: TigerLake-LP GT2 [Iris Xe Graphics]
Rev: 01
ProgIf: 00
IOMMUGroup: 2

Slot: 00:04.0
Class: Signal processing controller
Vendor: Intel Corporation
Device: TigerLake-LP Dynamic Tuning Processor Participant
SVendor: Hewlett-Packard Company
# ...

But this is not very productive if you are interested in IOMMU groups first and PCI devices second.

After a bit of tuning I ended up with an "lsusb style" dump:

❯ lsiommu
Group 000 00:07.1 Class 060400: Vendor 8086 Device 9a25 [8086:9a25] (rev 01)
Group 001 00:07.0 Class 060400: Vendor 8086 Device 9a23 [8086:9a23] (rev 01)
Group 002 00:02.0 Class 030000: Vendor 8086 Device 9a49 [8086:9a49] (rev 01)
Group 003 00:00.0 Class 060000: Vendor 8086 Device 9a14 [8086:9a14] (rev 01)
Group 004 00:04.0 Class 118000: Vendor 8086 Device 9a03 [8086:9a03] (rev 01)
Group 005 00:0a.0 Class 118000: Vendor 8086 Device 9a0d [8086:9a0d] (rev 01)
Group 006 00:0d.0 Class 0c0330: Vendor 8086 Device 9a13 [8086:9a13] (rev 01)
Group 006 00:0d.2 Class 0c0340: Vendor 8086 Device 9a1b [8086:9a1b] (rev 01)
Group 007 00:0e.0 Class 010400: Vendor 8086 Device 9a0b [8086:9a0b] (rev 00)
Group 008 00:14.0 Class 0c0330: Vendor 8086 Device a0ed [8086:a0ed] (rev 20)
Group 008 00:14.2 Class 050000: Vendor 8086 Device a0ef [8086:a0ef] (rev 20)
Group 009 00:14.3 Class 028000: Vendor 8086 Device a0f0 [8086:a0f0] (rev 20)
Group 010 00:15.0 Class 0c8000: Vendor 8086 Device a0e8 [8086:a0e8] (rev 20)
Group 011 00:16.0 Class 078000: Vendor 8086 Device a0e0 [8086:a0e0] (rev 20)
Group 012 00:1d.0 Class 060400: Vendor 8086 Device a0b0 [8086:a0b0] (rev 20)
Group 013 00:1f.0 Class 060100: Vendor 8086 Device a082 [8086:a082] (rev 20)
Group 013 00:1f.3 Class 040100: Vendor 8086 Device a0c8 [8086:a0c8] (rev 20)
Group 013 00:1f.4 Class 0c0500: Vendor 8086 Device a0a3 [8086:a0a3] (rev 20)
Group 013 00:1f.5 Class 0c8000: Vendor 8086 Device a0a4 [8086:a0a4] (rev 20)
Group 014 55:00.0 Class 010802: Vendor 144d Device a808 [144d:a808] (rev 00)

I don't want to stringify classes or other elements right now because this output always fits the screen, which matters to me.
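
For reference, the one-line-per-device layout above can be reproduced with a single `snprintf()` format string. This is a sketch reconstructed from the printed output only: `struct pci_entry` and `format_entry` are hypothetical names, and the real lsiommu code may build the line differently.

```c
#include <stdio.h>

/* Hypothetical record; field names are illustrative, not from lsiommu. */
struct pci_entry {
	unsigned int group;
	const char *addr;		/* e.g. "00:1f.3" */
	unsigned int class_code;	/* 24-bit PCI class code */
	unsigned int vendor;
	unsigned int device;
	unsigned int rev;
};

/* Writes one output line into buf; returns the snprintf() result. */
int format_entry(char *buf, size_t len, const struct pci_entry *e)
{
	return snprintf(buf, len,
			"Group %03u %s Class %06x: Vendor %04x Device %04x "
			"[%04x:%04x] (rev %02x)",
			e->group, e->addr, e->class_code, e->vendor,
			e->device, e->vendor, e->device, e->rev);
}
```

With fixed-width numeric fields every line has the same length, which is what keeps the dump within an 80-column terminal.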
@ge0rg it does not do a whole lot that you could not do with trivial scripting. i had such a script for 10 years with bash and python but i wanted to consolidate that into a C program to get a baseline that can be extended to do more complex things in the future. and i needed to learn to use libudev properly for something more serious so this was a good exercise to test run it (my original script traversed sysfs).