qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v0 2/2] block: postpone the coroutine executing


From: Kevin Wolf
Subject: Re: [Qemu-devel] [PATCH v0 2/2] block: postpone the coroutine executing if the BDS's is drained
Date: Thu, 13 Sep 2018 10:44:50 +0200
User-agent: Mutt/1.9.1 (2017-09-22)

Am 12.09.2018 um 19:03 hat Denis V. Lunev geschrieben:
> On 09/12/2018 04:15 PM, Kevin Wolf wrote:
> > Am 12.09.2018 um 14:03 hat Denis Plotnikov geschrieben:
> >> On 10.09.2018 15:41, Kevin Wolf wrote:
> >>> Am 29.06.2018 um 14:40 hat Denis Plotnikov geschrieben:
> >>>> Fixes the problem of ide request appearing when the BDS is in
> >>>> the "drained section".
> >>>>
> >>>> Without the patch the request can come and be processed by the main
> >>>> event loop, as the ide requests are processed by the main event loop
> >>>> and the main event loop doesn't stop when its context is in the
> >>>> "drained section".
> >>>> The request execution is postponed until the end of "drained section".
> >>>>
> >>>> The patch doesn't modify ide specific code, as well as any other
> >>>> device code. Instead, it modifies the infrastructure of asynchronous
> >>>> Block Backend requests, in favor of postponing the requests arisen
> >>>> when in "drained section" to remove the possibility of request appearing
> >>>> for all the infrastructure clients.
> >>>>
> >>>> This approach doesn't make vCPU processing the request wait untill
> >>>> the end of request processing.
> >>>>
> >>>> Signed-off-by: Denis Plotnikov <address@hidden>
> >>> I generally agree with the idea that requests should be queued during a
> >>> drained section. However, I think there are a few fundamental problems
> >>> with the implementation in this series:
> >>>
> >>> 1) aio_disable_external() is already a layering violation and we'd like
> >>>     to get rid of it (by replacing it with a BlockDevOps callback from
> >>>     BlockBackend to the devices), so adding more functionality there
> >>>     feels like a step in the wrong direction.
> >>>
> >>> 2) Only blk_aio_* are fixed, while we also have synchronous public
> >>>     interfaces (blk_pread/pwrite) as well as coroutine-based ones
> >>>     (blk_co_*). They need to be postponed as well.
> >> Good point! Thanks!
> 
> Should we really prohibit all public interfaces, as they are reused
> inside block level?

We need to fix that. blk_*() should never be called from inside the BDS
layer.

> There is also a problem which is not stated in the clear words yet.
> We have potential deadlock in the code under the following
> conditions, which should be also taken into the consideration.
> 
> <path from the controller>
> bdrv_co_pwritev
>     bdrv_inc_in_flight
>     bdrv_aligned_pwritev
>         notifier_list_with_return_notify
>              backup_before_write_notify
>                  backup_do_cow
>                      backup_cow_with_bounce_buffer
>                          blk_co_preadv
> 
> Here blk_co_preadv() must finish its work before we will release the
> notifier and finish request initiated from the controller and which
> has incremented in-fligh counter.

Yes, before_write notifiers are evil. I've objected to them since the
day they were introduced and I'm surprised it's becoming a problem only
now.

We should probably change the backup job to insert a job node rather
sooner than later. Then it doesn't need to call blk_*() any more.

> Thus we should differentiate requests initiated at the controller
> level and requests initiated in the block layer.  This is sad but
> true.

The difference is supposed to be whether a request goes through a
BlockBackend or not.

Kevin



reply via email to

[Prev in Thread] Current Thread [Next in Thread]