[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v2 5/6] coroutine: Explicitly specify AioContext
From: |
Paolo Bonzini |
Subject: |
Re: [Qemu-devel] [PATCH v2 5/6] coroutine: Explicitly specify AioContext when entering coroutine |
Date: |
Mon, 10 Apr 2017 08:56:00 +0800 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0 |
On 07/04/2017 23:14, Kevin Wolf wrote:
> One part of the reason is that
> BDRV_POLL_WHILE() drops the lock since commit c9d1a561, so just calling
> aio_context_acquire() doesn't protect you any more if there is any
> chance that a nested function calls BDRV_POLL_WHILE().
Hi guys, sorry for the late reply.
There wasn't actually much that changed in commit c9d1a561. Everything
that could break was also broken before.
With commit c9d1a561 the logic is
aio_context_acquire
prepare I/O
aio_context_release
poll main AioContext
aio_context_acquire
aio_poll secondary AioContext
complete I/O
<-- bdrv_wakeup
aio_context_acquire
aio_context_release
Sync I/O is complete
while before it was
aio_context_acquire
prepare I/O
aio_poll secondary AioContext
complete I/O
Sync I/O is complete
The code that can run in "aio_poll secondary AioContext" is the same in
both cases. The difference is that, after c9d1a561, it always runs in
one thread which eliminates the need for RFifoLock's contention
callbacks (and a bunch of bdrv_drain_all deadlocks that arose from the
combination of contention callbacks and recursive locking).
The patch that introduced the bug is the one that introduced the "home
AioContext" co->ctx for coroutines, because now you have
aio_context_acquire
prepare I/O
aio_context_release
poll main AioContext
aio_context_acquire
aio_poll secondary AioContext
aio_co_wake
complete I/O
bdrv_wakeup
aio_context_acquire
aio_context_release
Sync I/O is complete
and if "complete I/O" causes anything to happen in the iothread, bad
things happen.
I think Fam's analysis is right. This patch will hopefully be reverted
in 2.10, but right now it's the right thing to do. However, I don't
like very much adding an argument to qemu_coroutine_enter because:
1) coroutines can be used without AioContexts (CoMutex has a dependency
on aio_co_wake, but it is hidden behind the API and if you have a single
AioContext you could just define aio_co_wake to be qemu_coroutine_enter).
2) the value that is assigned to co->ctx is not the AioContext in which
the coroutine is running.
It would be better to split aio_co_wake so that aio_co_wake is
/* Read coroutine before co->ctx. Matches smp_wmb in
* qemu_coroutine_enter.
*/
smp_read_barrier_depends();
ctx = atomic_read(&co->ctx);
aio_co_enter(ctx, co);
and aio_co_enter is the rest of the logic.
Then the coroutine code always runs in the right thread, which obviously
is thread-safe.
Paolo
Re: [Qemu-devel] [PATCH v2 0/6] block: Fixes regarding dataplane and management operations, Kevin Wolf, 2017/04/07