BTW in the last couple of days I did some more work beyond v4:
- Added a benchmark (not a correctness test) to measure parallel
performance of QHT (recall that test/qht-test is sequential).
- Added support for concurrent insertions as long as they target
different buckets, thus getting rid of the "external lock" requirement.
This is not really needed for MTTCG because all insertions are supposed
to be serialized by tb_lock; however, the feature (1) has no negative
performance impact (just adds an unlikely() branch after lock acquisition
on insertions/removals) and (2) could be useful for future (parallel)
users of qht.
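
For illustration only, here is a minimal sketch of per-bucket locking with an
unlikely() branch right after lock acquisition. This is not the qht code: every
name below (struct table, table_insert, table_mark_stale, the stale flag) is
hypothetical, and the sketch assumes the branch checks whether a concurrent
resize has replaced the table, which the text above does not state. It also
assumes GCC/Clang for __builtin_expect and pthreads for the locks.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hint that the condition is rarely true (GCC/Clang builtin). */
#define unlikely(x) __builtin_expect(!!(x), 0)

#define N_BUCKETS      4
#define BUCKET_ENTRIES 4

struct bucket {
    pthread_mutex_t lock;
    uint32_t keys[BUCKET_ENTRIES];   /* 0 marks an empty slot */
};

struct table {
    struct bucket buckets[N_BUCKETS];
    bool stale;   /* assumption: set once a resize has replaced this table */
};

static void table_init(struct table *t)
{
    for (size_t i = 0; i < N_BUCKETS; i++) {
        pthread_mutex_init(&t->buckets[i].lock, NULL);
        for (size_t j = 0; j < BUCKET_ENTRIES; j++) {
            t->buckets[i].keys[j] = 0;
        }
    }
    t->stale = false;
}

/* A resizer would take every bucket lock before marking the table stale,
 * so any thread holding a single bucket lock sees a consistent flag. */
static void table_mark_stale(struct table *t)
{
    for (size_t i = 0; i < N_BUCKETS; i++) {
        pthread_mutex_lock(&t->buckets[i].lock);
    }
    t->stale = true;
    for (size_t i = 0; i < N_BUCKETS; i++) {
        pthread_mutex_unlock(&t->buckets[i].lock);
    }
}

/* Returns true if the key was stored; false if the bucket is full or the
 * table went stale (the caller would then retry on the new table). */
static bool table_insert(struct table *t, uint32_t key)
{
    struct bucket *b = &t->buckets[key % N_BUCKETS];
    bool ok = false;

    assert(key != 0);   /* 0 is reserved for empty slots */
    pthread_mutex_lock(&b->lock);
    if (unlikely(t->stale)) {
        /* Rare path: a resize won the race for this bucket. */
        pthread_mutex_unlock(&b->lock);
        return false;
    }
    for (size_t i = 0; i < BUCKET_ENTRIES; i++) {
        if (b->keys[i] == 0) {
            b->keys[i] = key;
            ok = true;
            break;
        }
    }
    pthread_mutex_unlock(&b->lock);
    return ok;
}
```

In this sketch two threads inserting keys that map to different buckets never
contend on the same mutex, and the only extra cost on the common path is the
single predicted-not-taken branch.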
Should I send this work as follow-up patches to v4 to ease review, or
should I send a v5 with them merged in?