@datenwolf Ok, time to ask the lists:
Hi!
It seems that DMA-BUFs are always uncached on arm64... which is a
problem.
I'm trying to get useful camera support on Librem 5, and that includes
recording videos (and taking photos).
memcpy() from normal memory takes about 2msec/1MB. Unfortunately, for
DMA-BUFs it is 20msec/1MB, and that basically means I can't easily do
760p video recording. Plus, copying a full-resolution photo buffer takes
more than 200msec!
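
For anyone who wants to reproduce the per-megabyte numbers, a minimal
sketch of the kind of timing loop involved is below. The function name
memcpy_ms_per_mib() is made up for illustration, it counts in MiB, and it
assumes Linux/POSIX clock_gettime() and a GCC/Clang-style compiler
barrier. It is not the exact code the figures above came from.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: average time in milliseconds needed to memcpy()
 * one MiB from src to dst. Copies `bytes` bytes `reps` times and divides
 * the elapsed wall-clock time by the total amount copied.
 */
static double memcpy_ms_per_mib(void *dst, const void *src,
                                size_t bytes, int reps)
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < reps; i++) {
        memcpy(dst, src, bytes);
        /* Compiler barrier so repeated copies are not optimized away. */
        __asm__ volatile("" : : "r"(dst) : "memory");
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ms = (double)(t1.tv_sec - t0.tv_sec) * 1e3 +
                (double)(t1.tv_nsec - t0.tv_nsec) / 1e6;
    double mib = (double)bytes * (double)reps / (1024.0 * 1024.0);

    return ms / mib;
}
```

Pass a malloc()ed buffer as src for the "normal memory" case and the
mmap()ed DMA-BUF pointer for the other one. Use a buffer larger than the
last-level cache, otherwise the normal-memory figure mostly measures
cache bandwidth.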
There's a possibility to do some processing on the GPU, and it's implemented here:
https://gitlab.com/tui/tui/-/tree/master/icam?ref_type=heads
but that hits the same problem in the end -- data is in DMA-BUF,
uncached, and takes way too long to copy out.
And that's ... wrong. DMA ended seconds ago, a complete cache flush
would be way cheaper than copying a single frame out, and I still have
to deal with uncached frames.
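
For reference, the userspace side of DMA-BUF cache maintenance is the
DMA_BUF_IOCTL_SYNC bracket from <linux/dma-buf.h>. The sketch below shows
what I would expect to call around a CPU read if the mapping were cached.
The helper names are made up for illustration. As far as I understand,
the ioctl only asks the exporter to do its begin/end CPU access work and
does not change how the buffer is mapped, so on its own it does not make
the copy any faster.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/dma-buf.h>

/*
 * Hypothetical helper: issue DMA_BUF_IOCTL_SYNC on a dma-buf fd,
 * retrying if the call is interrupted. Returns 0 on success, -1 with
 * errno set on failure.
 */
static int dmabuf_sync(int fd, uint64_t flags)
{
    struct dma_buf_sync sync = { .flags = flags };
    int ret;

    do {
        ret = ioctl(fd, DMA_BUF_IOCTL_SYNC, &sync);
    } while (ret == -1 && (errno == EINTR || errno == EAGAIN));

    return ret;
}

/* Call before the CPU reads from the mmap()ed buffer. */
static int dmabuf_begin_cpu_read(int fd)
{
    return dmabuf_sync(fd, DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ);
}

/* Call once the CPU is done reading. */
static int dmabuf_end_cpu_read(int fd)
{
    return dmabuf_sync(fd, DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ);
}
```

The intended use is dmabuf_begin_cpu_read(fd), then the memcpy() out of
the mapping, then dmabuf_end_cpu_read(fd).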
So I have two questions:
1) Is my analysis correct that, no matter how I get a frame from v4l and
process it on the GPU, I'll have to copy it from uncached memory in the
end?
2) Does anyone have patches / ideas / a roadmap for how to solve that? It
makes the GPU unusable for computing, and the camera basically unusable for
video.
Best regards,
Pavel