Re: [Qemu-block] [PATCH] virtio-blk: dataplane: release AioContext befor

qemu-block

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [PATCH] virtio-blk: dataplane: release AioContext befor

From:	Sergio Lopez
Subject:	Re: [Qemu-block] [PATCH] virtio-blk: dataplane: release AioContext before blk_set_aio_context
Date:	Wed, 06 Mar 2019 13:47:19 +0100
User-agent:	mu4e 1.0; emacs 26.1

Kevin Wolf writes:

> Am 01.03.2019 um 14:47 hat Sergio Lopez geschrieben:
>> >> Otherwise, we can simply add an extra condition at
>> >> child_job_drained_poll(), before the drv->drained_poll(), to return
>> >> true if the job isn't yet paused.
>> >
>> > Yes, I think something like this is this right fix.
>> 
>> Fixing this has uncovered another issue also triggered by issuing
>> 'block_commit' and 'device_del' consecutively. At the end, mirror_run()
>> calls to bdrv_drained_begin(), which is scheduled of later (via
>> bdrv_co_yield_to_drain()) as the mirror job is running in a coroutine.
>> 
>> At the same time, the Guest requests the device to be unplugged, which
>> leads to blk_unref()->blk_drain()->bdrv_do_drained_begin(). When the
>> latter reaches BDRV_POLL_WHILE, the bdrv_drained_begin scheduled above
>> is run, which also runs BDRV_POLL_WHILE, leading to the thread getting
>> stuck in aio_poll().
>> 
>> Is it really safe scheduling a bdrv_drained_begin() with poll == true?
>
> I don't see what the problem would be with it in theory. Once the node
> becomes idle, both the inner and the outer BDRV_POLL_WHILE() should
> return.
>
> The question with such hangs is usually, what is the condition that made
> bdrv_drain_poll() return true, and why aren't we making progress so that
> is would become false. With iothreads, it could also be that the
> condition has actually already changed, but aio_wait_kick() wasn't
> called, so aio_poll() isn't woken up.

Turns out we can't restrict child_job_drained_poll() to signal
completion only if the job has already been effectively paused or
cancelled, as we may reach this point from job_finish_sync().

Do you think it's worth to keep trying that bdrv_drained_begin() only
returns when the related jobs are completely paused, or should we just
use AIO_WAIT_WHILE at block_job_detach_aio_context() as previously
suggested?

Thanks,
Sergio (slp).

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [Qemu-block] [PATCH] virtio-blk: dataplane: release AioContext before blk_set_aio_context, Kevin Wolf, 2019/03/01
- Re: [Qemu-block] [PATCH] virtio-blk: dataplane: release AioContext before blk_set_aio_context, Sergio Lopez, 2019/03/01
  - Re: [Qemu-block] [PATCH] virtio-blk: dataplane: release AioContext before blk_set_aio_context, Kevin Wolf, 2019/03/01
    - Re: [Qemu-block] [PATCH] virtio-blk: dataplane: release AioContext before blk_set_aio_context, Sergio Lopez <=

Prev by Date: Re: [Qemu-block] [Qemu-devel] [PATCH v2] file-posix: add drop-cache=on|off option
Next by Date: Re: [Qemu-block] [Qemu-devel] [PATCH v2] file-posix: add drop-cache=on|off option
Previous by thread: Re: [Qemu-block] [PATCH] virtio-blk: dataplane: release AioContext before blk_set_aio_context
Next by thread: Re: [Qemu-block] [PATCH 09/13] block: Add 'runtime_opts' and 'mutable_opts' fields to BlockDriver
Index(es):
- Date
- Thread