The guy who wrote the epoch library at $WORK decided to tag garbage with the thread's local epoch rather than the global epoch, for efficiency. So you need 4 epochs rather than 3: since a local epoch can lag at most 1 behind the global epoch, garbage tagged with a local epoch is only safe to deallocate once its tag is 3 behind the new global epoch, rather than 2 behind as usual. I had not seen this approach mentioned anywhere else, which I found strange since it seems like an obvious optimization. However, today I saw it mentioned in discussions for the `crossbeam-epoch` crate:
https://github.com/crossbeam-rs/crossbeam/pull/416
https://github.com/crossbeam-rs/crossbeam/issues/238#issuecomment-525795655
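The epoch arithmetic described above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: every name here (`reclaimable`, `bag_index`, `reclaimable_bag`, the constants) is hypothetical and comes from neither the $WORK library nor `crossbeam-epoch`, and epochs are modeled as plain `u64` counters that never wrap.

```rust
// Hypothetical names throughout; a sketch of the arithmetic, not a real API.
// Assumption: epochs are monotonically increasing u64 counters.

/// Garbage tagged with the global epoch is safe once its tag is 2 behind.
const GLOBAL_TAG_LAG: u64 = 2;
/// A local epoch may itself lag the global epoch by 1, so allow 1 more.
const LOCAL_TAG_LAG: u64 = GLOBAL_TAG_LAG + 1;
/// Tags global, global-1 and global-2 are still live while global-3 is
/// being freed, so 4 bags are needed when counting epochs mod 4.
const NUM_BAGS: u64 = LOCAL_TAG_LAG + 1;

/// True if garbage tagged `tag` may be freed when the global epoch is
/// `global`, given the required lag for the tagging scheme in use.
fn reclaimable(tag: u64, global: u64, lag: u64) -> bool {
    global.saturating_sub(tag) >= lag
}

/// Bag for a given epoch: masking out the high bits is the same as
/// taking the epoch mod 4, because NUM_BAGS is a power of two.
fn bag_index(epoch: u64) -> usize {
    (epoch & (NUM_BAGS - 1)) as usize
}

/// Bag that becomes safe to empty when the global epoch reaches `global`
/// under local-epoch tagging.
fn reclaimable_bag(global: u64) -> usize {
    bag_index(global.wrapping_sub(LOCAL_TAG_LAG))
}
```

For example, garbage tagged 5 is not yet reclaimable at global epoch 7 under local tagging (`reclaimable(5, 7, LOCAL_TAG_LAG)` is false) but would be under global tagging, and at global epoch 8 the reclaimable bag is `bag_index(5)`, which does not collide with the bags for the still-live tags 6, 7 and 8.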
@tobinbaker counting epochs mod 4 is already a classic. Tagging with TSC is pretty nice too, to avoid contention on the global epoch.
@pkhuong not surprised it's folklore, I just don't remember seeing it in any papers. any references for using TSC as an epoch?
@tobinbaker Yeah, the papers seem to all be about the minimum number of epochs, instead of trying to do nice things with masking out high bits and seeing what comes out. Re TSC, I remember discussing that with Parmer ca 2016 https://www2.seas.gwu.edu/~gparmer/pubs.html
@pkhuong I guess it would be simple enough to use TSC for hazard eras or IBR
@pkhuong @tobinbaker Opened up one of the papers and saw "SMR (Scalable Memory Reclamation)" which is a funny backronym. Although I've always thought "Safe Memory Reclamation" was a silly term and wish we all just agreed to call it concurrent memory reclamation or whatever.
@pkhuong I did find a "TSC-IBR" paper, but while the idea of localizing the pointer read in time is nifty, I don't understand how a store-load barrier isn't necessary after publishing the TSC value, since they don't have any asymmetric barrier on the GC side.
https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=11000280
@pkhuong also found an interesting paper that hybridizes HP and EBR but in a different way than HE/IBR: all protected pointers are added to a thread-local list but not immediately published, while being protected by EBR. Then, when a stalled thread is detected, its pointer list is converted to HP protection and the epoch is advanced without the thread's cooperation. Lots of details I haven't absorbed yet (e.g. over-approximating pointer lists with Bloom filters), but the idea seems promising and they implemented it in crossbeam.
@tobinbaker @pkhuong FWIW, those KAIST researchers are also the maintainers of crossbeam that you linked to earlier. They also have an interesting benchmark here: https://github.com/kaist-cp/smr-benchmark/