On Thu, Mar 02, 2023 at 10:46:56AM +0100, David Hildenbrand wrote:
On 25.02.23 17:31, Peter Xu wrote:
[not for merging, but for discussion; this is something I found when
looking at another issue on Chuang's optimization for migration downtime]
Summary: we tried to access memory_listeners, address_spaces, etc. in RCU
way. However we didn't implement them with RCU-safety. This patchset is
trying to do that; at least making it closer.
NOTE! It's doing it wrongly for now, so please feel free to see this as a
thread to start discussing this problem, as in subject.
The core problem here is how to make sure memory listeners will be freed in
RCU ways, per when unlinking them from the global memory_listeners list.
Can you elaborate why we would want to do that? Is there a real reason we
cannot hold the BQL when unregistering a listener?
Yes afaict we must hold BQL when unregister any listener for now. I added
an explicit assert in patch 1 for that.
We want to do that because potentially we have RCU readers accessing these
two lists, so here taking BQL only is not enough. We need to release the
objects after all users are gone.
We already do that for address spaces, but afaict the listener part was
overlooked. The challenge here is how to achieve the same for listeners.
Or could we use any other, more fine-grained, lock to protect the memory
listeners?
Naive me would think that any interactions between someone updating the
memory listeners, and a listener getting removed, would require some careful
synchronization (to not rip a notifier out while someone else notifies --
what is the still registered notifier supposed to do with notifications
while it is already going away?), instead of doing it via RCU.
I'm all for using RCU if it improves performance and keeps things simple. If
RCU is neither required for performance reason and overcomplicates the
implementation, maybe using locking is the better choice.
For ASes, one major user RCU is memory_region_find_rcu().
For listeners, the only path that doesn't take BQL (afaict) is
memory_region_clear_dirty_bitmap(). Maybe you'll have some points here on
the side effect of taking it because it's in either virtio-mem or balloon
path for page hinting iirc.