@jarkko I see you've taken the 'it's Sunday' thing seriously, especially the 'it's Sunday after a conference you spoke at' part. Honestly, bro, come on.
It's not a lack of scalability - I explained why we CANNOT MERGE these. We're not 'failing to scale'; your driver is choosing to do something that does not 'scale'.
To repeat - the PFN mappings can map any set of PFNs to the VMA virtual range (see remap_pfn_range_notrack()) - so on what basis could we merge adjacent VM_PFNMAP VMAs?
A VMA is a virtually contiguous memory range with the same attributes throughout - but if we merged these (without walking the page tables) that would no longer hold, since the merged VMA could describe entirely unrelated PFN ranges. This would be 'somewhat' surprising for anybody who mapped distinct PFN ranges in adjacent mappings.
Keep in mind the prot bits of the mapping can vary as well as PFN ranges.
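To make that concrete, here's a rough sketch (the helper is made up) of the kind of thing any PFNMAP driver is free to do in its .mmap handler - nothing at the VMA level tells you which PFNs ended up behind which mapping:

static int my_dev_mmap(struct file *file, struct vm_area_struct *vma)
{
	unsigned long size = vma->vm_end - vma->vm_start;
	/* Hypothetical helper - the driver picks whatever PFN it likes
	 * for this offset. */
	unsigned long pfn = my_dev_pfn_for(file, vma->vm_pgoff);

	/* Sets VM_PFNMAP (among other flags) and installs PTEs for an
	 * arbitrary PFN range with vma->vm_page_prot. */
	return remap_pfn_range(vma, vma->vm_start, pfn, size,
			       vma->vm_page_prot);
}

Two mmap() calls that happen to land virtually adjacent can therefore reference completely unrelated physical ranges, even though both VMAs carry VM_PFNMAP and otherwise identical flags.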
Walking the page tables and actually checking this might be possible, but it'd cause huge latency whenever you map adjacent or overlapping VMAs. Encoding the data in the VMA instead would be fraught and fragile, and then there are concerns about locking and races here...
To me that latency thing is what kills it.
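Just to illustrate (a sketch only, not something I'm proposing - names made up, huge mappings and prot comparison glossed over): a merge-time check would have to do something like the below for every candidate merge, under the mmap lock, with cost proportional to the size of the range being merged.

struct pfn_contig_state {
	unsigned long next_pfn;	/* PFN we expect at the next PTE */
	bool mergeable;
};

/* Walk every VMA in the range, including VM_PFNMAP ones, which the
 * pagewalk code skips by default. */
static int pfn_contig_test_walk(unsigned long start, unsigned long end,
				struct mm_walk *walk)
{
	return 0;
}

static int pfn_contig_pte(pte_t *ptep, unsigned long addr,
			  unsigned long next, struct mm_walk *walk)
{
	struct pfn_contig_state *state = walk->private;
	pte_t pte = ptep_get(ptep);

	/* Would also need to compare prot here - glossed over. */
	if (!pte_present(pte) || pte_pfn(pte) != state->next_pfn) {
		state->mergeable = false;
		return 1;	/* mismatch, stop the walk */
	}
	state->next_pfn++;
	return 0;
}

static const struct mm_walk_ops pfn_contig_ops = {
	.test_walk	= pfn_contig_test_walk,
	.pte_entry	= pfn_contig_pte,
};

/* Do the PTEs over [start, end) form one contiguous run starting at
 * first_pfn? O(range) work on every candidate merge. */
static bool pfnmap_range_contiguous(struct mm_struct *mm, unsigned long start,
				    unsigned long end, unsigned long first_pfn)
{
	struct pfn_contig_state state = {
		.next_pfn	= first_pfn,
		.mergeable	= true,
	};

	mmap_assert_locked(mm);
	walk_page_range(mm, start, end, &pfn_contig_ops, &state);
	return state.mergeable;
}

And that's before you get to the locking and race questions mentioned above.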
Your driver seems to be mapping a bunch of things immediately adjacent to one another, and presumably you're implying these have equal prot and immediately adjacent PFNs? If not, then they SHOULD NOT be merged anyway.
The .close() thing is obviously bogus when you're dealing with special VMAs, though the semantics around .close() handling have changed with my series (in practice pretty much identical behaviour after Vlasta's optimisation on this).
Have you actually hit some kind of issue as a result of this? What 'lack of scalability' is the issue here?
It's not something I'm going to dive into, because I don't believe there's an issue here, and you're bugging me on a Sunday about it, which is making me grumpy now.