This was pretty cool:
❯ sudo bpftrace -e 'k:tpm_transmit { @start[tid] = nsecs; } kr:tpm_transmit { @[kstack, ustack, comm] = sum(nsecs - @start[tid]); delete(@start[tid]); } END { clear(@start); }'
Attaching 3 probes...
^C
@[
tpm_transmit_cmd+46
tpm2_flush_context+120
tpm2_commit_space+197
tpm_dev_transmit.constprop.0+137
tpm_dev_async_work+102
process_one_work+374
worker_thread+614
kthread+207
ret_from_fork+49
ret_from_fork_asm+26
, , kworker/4:2]: 2860677
@[
tpm_dev_transmit.constprop.0+111
tpm_dev_async_work+102
process_one_work+374
worker_thread+614
kthread+207
ret_from_fork+49
ret_from_fork_asm+26
, , kworker/16:1]: 3890693
@[
tpm_transmit_cmd+46
tpm2_load_context+195
tpm2_prepare_space+410
tpm_dev_transmit.constprop.0+54
tpm_dev_async_work+102
process_one_work+374
worker_thread+614
kthread+207
ret_from_fork+49
ret_from_fork_asm+26
, , kworker/4:2]: 9058524
@[
tpm_transmit_cmd+46
tpm2_save_context+179
tpm2_commit_space+314
tpm_dev_transmit.constprop.0+137
tpm_dev_async_work+102
process_one_work+374
worker_thread+614
kthread+207
ret_from_fork+49
ret_from_fork_asm+26
, , kworker/4:2]: 11426260
@[
tpm_transmit_cmd+46
tpm2_load_context+195
tpm2_prepare_space+318
tpm_dev_transmit.constprop.0+54
tpm_dev_async_work+102
process_one_work+374
worker_thread+614
kthread+207
ret_from_fork+49
ret_from_fork_asm+26
, , kworker/4:2]: 14182972
@[
tpm_transmit_cmd+46
tpm2_save_context+179
tpm2_commit_space+155
tpm_dev_transmit.constprop.0+137
tpm_dev_async_work+102
process_one_work+374
worker_thread+614
kthread+207
ret_from_fork+49
ret_from_fork_asm+26
, , kworker/4:2]: 22597059
@[
tpm_dev_transmit.constprop.0+111
tpm_dev_async_work+102
process_one_work+374
worker_thread+614
kthread+207
ret_from_fork+49
ret_from_fork_asm+26
, , kworker/4:2]: 1958500581
This gives me the total time in nanoseconds spent in tpm_transmit for each distinct call stack while bpftrace was running :-) Nothing spectacular, but I believe this might be enough to get hold of a performance regression:
https://lore.kernel.org/linux-integrity/D43JXBFOOB2O.3U6ZQ7DASR1ZW@kernel.org/
I’m a total beginner with eBPF stuff, and not an expert in tracing and profiling, so any improvement suggestions are welcome.
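One possible refinement of the one-liner above, as an untested sketch. It guards the kretprobe so that calls already in flight when the probes attach (and so have no recorded start time) are skipped. It drops ustack, which is empty for the kworker threads in the output anyway. It also records a call count and a latency histogram next to the total. The file name tpm_latency.bt and the map names @total_ns, @calls and @latency_us are made up for illustration.

```shell
# Write the bpftrace program to a file (hypothetical name: tpm_latency.bt).
cat > tpm_latency.bt <<'EOF'
// Record the entry timestamp per thread.
kprobe:tpm_transmit
{
    @start[tid] = nsecs;
}

// Only account for returns whose entry was actually seen.
kretprobe:tpm_transmit
/@start[tid]/
{
    $delta = nsecs - @start[tid];
    @total_ns[kstack, comm] = sum($delta);   // total time per kernel stack
    @calls[kstack, comm] = count();          // number of calls per kernel stack
    @latency_us = hist($delta / 1000);       // overall latency distribution in microseconds
    delete(@start[tid]);
}

// Do not print leftover start timestamps on exit.
END
{
    clear(@start);
}
EOF

# Then run it until Ctrl-C:
#   sudo bpftrace tpm_latency.bt
```

With both a total and a count per stack, the average time per call falls out by dividing one by the other, which should make it easier to tell "called more often" apart from "each call got slower".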