[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-block] [Qemu-devel] [PATCH v2 5/6] coroutine: Explicitly speci
From: |
Fam Zheng |
Subject: |
Re: [Qemu-block] [Qemu-devel] [PATCH v2 5/6] coroutine: Explicitly specify AioContext when entering coroutine |
Date: |
Mon, 10 Apr 2017 09:43:56 +0800 |
User-agent: |
Mutt/1.8.0 (2017-02-23) |
On Mon, 04/10 08:56, Paolo Bonzini wrote:
>
>
> On 07/04/2017 23:14, Kevin Wolf wrote:
> > One part of the reason is that
> > BDRV_POLL_WHILE() drops the lock since commit c9d1a561, so just calling
> > aio_context_acquire() doesn't protect you any more if there is any
> > chance that a nested function calls BDRV_POLL_WHILE().
>
> Hi guys, sorry for the late reply.
>
> There wasn't actually much that changed in commit c9d1a561. Everything
> that could break was also broken before.
>
> With commit c9d1a561 the logic is
>
> aio_context_acquire
> prepare I/O
> aio_context_release
> poll main AioContext
> aio_context_acquire
> aio_poll secondary AioContext
> complete I/O
> <-- bdrv_wakeup
> aio_context_acquire
> aio_context_release
> Sync I/O is complete
>
> while before it was
>
> aio_context_acquire
> prepare I/O
> aio_poll secondary AioContext
> complete I/O
> Sync I/O is complete
>
> The code that can run in "aio_poll secondary AioContext" is the same in
> both cases.
This is not completely true. Before this fact, a blocked aio_context_acquire()
in virtio_scsi_data_plane_handle_cmd() wouldn't be woken up during the sync I/O.
Of course the aio context pushdown (9d4566544) happened after c9d1a561, so this
is a compound consequence of the two.
Also, not polling for at least once in bdrv_drain_poll if
!bdrv_requests_pending(bs), since 997235485, made the situation a bit worse -
virtio_scsi_data_plane_handle_cmd could have returned before
bdrv_drained_begin() returns if we do the BDRV_POLL_WHILE body at least once
even if cond is false.
> The difference is that, after c9d1a561, it always runs in
> one thread which eliminates the need for RFifoLock's contention
> callbacks (and a bunch of bdrv_drain_all deadlocks that arose from the
> combination of contention callbacks and recursive locking).
>
> The patch that introduced the bug is the one that introduced the "home
> AioContext" co->ctx for coroutines, because now you have
>
> aio_context_acquire
> prepare I/O
> aio_context_release
> poll main AioContext
> aio_context_acquire
> aio_poll secondary AioContext
> aio_co_wake
> complete I/O
> bdrv_wakeup
> aio_context_acquire
> aio_context_release
> Sync I/O is complete
>
> and if "complete I/O" causes anything to happen in the iothread, bad
> things happen.
>
> I think Fam's analysis is right. This patch will hopefully be reverted
> in 2.10, but right now it's the right thing to do. However, I don't
> like very much adding an argument to qemu_coroutine_enter because:
>
> 1) coroutines can be used without AioContexts (CoMutex has a dependency
> on aio_co_wake, but it is hidden behind the API and if you have a single
> AioContext you could just define aio_co_wake to be qemu_coroutine_enter).
>
> 2) the value that is assigned to co->ctx is not the AioContext in which
> the coroutine is running.
>
> It would be better to split aio_co_wake so that aio_co_wake is
>
> /* Read coroutine before co->ctx. Matches smp_wmb in
> * qemu_coroutine_enter.
> */
> smp_read_barrier_depends();
> ctx = atomic_read(&co->ctx);
> aio_co_enter(ctx, co);
>
> and aio_co_enter is the rest of the logic.
Is aio_co_enter what solves 1) for you too? If so I can add it in v3.
Fam
>
> Then the coroutine code always runs in the right thread, which obviously
> is thread-safe.
>
> Paolo
>
- Re: [Qemu-block] [PATCH v2 4/6] block: Quiesce old aio context during bdrv_set_aio_context, (continued)
[Qemu-block] [PATCH v2 5/6] coroutine: Explicitly specify AioContext when entering coroutine, Fam Zheng, 2017/04/07
[Qemu-block] [PATCH v2 6/6] tests/block-job-txn: Don't start block job before adding to txn, Fam Zheng, 2017/04/07
Re: [Qemu-block] [PATCH v2 0/6] block: Fixes regarding dataplane and management operations, Kevin Wolf, 2017/04/07