On 11.05.2020 at 18:50, Stefan Reiter wrote:
> Just because we're in a coroutine doesn't imply ownership of the context
> of the flushed drive. In such a case use the slow path which explicitly
> enters bdrv_flush_co_entry in the correct AioContext.
>
> Signed-off-by: Stefan Reiter <address@hidden>
> ---
>
> We've experienced some lockups in this codepath when taking snapshots of VMs
> with drives that have IO-Threads enabled (we have an async 'savevm'
> implementation running from a coroutine).
>
> I could not find a reproducer for upstream versions so far, but in testing
> this patch fixes all the issues we're seeing, and I think the logic checks out.
>
> The fast path pattern is repeated a few times in this file, so if this change
> makes sense, it's probably worth evaluating the other occurrences as well.

What do you mean by "owning" the context? If it's about taking the
AioContext lock, isn't the problem more with calling bdrv_flush() from
code that doesn't take the locks?
Though I think we have some code that doesn't only rely on holding the
AioContext locks, but that actually depends on running in the right
thread, so the change looks right anyway.
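
To make the condition under discussion concrete, here is a minimal
standalone sketch of the decision as I understand it. It is not the patch
itself: the types and the choose_flush_path() helper are hypothetical
stand-ins, and in the real code the comparison would presumably be between
bdrv_get_aio_context(bs) and qemu_get_current_aio_context().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only; not the real QEMU types. */
typedef struct AioContext {
    const char *name;
} AioContext;

typedef struct BlockDriverState {
    AioContext *ctx;    /* context (thread) that drives this node's I/O */
} BlockDriverState;

typedef enum {
    FLUSH_FAST_PATH,    /* call the flush entry point directly */
    FLUSH_SLOW_PATH     /* enter a new coroutine in the node's context, then poll */
} FlushPath;

/*
 * Being in a coroutine is not sufficient for the fast path: the caller
 * must also be running in the AioContext of the node being flushed.
 */
FlushPath choose_flush_path(bool in_coroutine,
                            const AioContext *current,
                            const BlockDriverState *bs)
{
    if (in_coroutine && bs->ctx == current) {
        return FLUSH_FAST_PATH;
    }
    return FLUSH_SLOW_PATH;
}

/* Walks through the three cases discussed in this thread. */
void check_examples(void)
{
    AioContext main_ctx = { "main-loop" };
    AioContext io_ctx = { "iothread0" };
    BlockDriverState drive = { &io_ctx };

    /* Coroutine in the main loop, drive owned by an iothread: slow path. */
    assert(choose_flush_path(true, &main_ctx, &drive) == FLUSH_SLOW_PATH);
    /* Coroutine already running in the drive's context: fast path. */
    assert(choose_flush_path(true, &io_ctx, &drive) == FLUSH_FAST_PATH);
    /* Not in a coroutine at all: slow path, as before. */
    assert(choose_flush_path(false, &io_ctx, &drive) == FLUSH_SLOW_PATH);
}
```

The first case in check_examples() corresponds to the savevm coroutine
described above: it runs in the main loop while the drive belongs to an
IO-Thread, so it no longer takes the fast path.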