qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH RFC 0/9] block: Rewrite block drain begin/end


From: Fam Zheng
Subject: Re: [Qemu-devel] [PATCH RFC 0/9] block: Rewrite block drain begin/end
Date: Thu, 30 Nov 2017 22:34:30 +0800
User-agent: Mutt/1.9.1 (2017-09-22)

On Thu, 11/30 11:31, Kevin Wolf wrote:
> What you really mean is probably connected components in the graph, but
> do we really want to manage merging and splitting object representing
> connected components when a node is added or removed from the graph?
> Especially when that graph change occurs in a drain callback?
> 
> You can also still easily introduce bugs where graph changes during a
> drain end up in nodes not being drained, possibly drained twice, you
> still access the next pointer of a deleted node or you accidentally
> switch to draining a different component.
> 
> It's probably possible to get this right, but essentially you're just
> switching from iterating a tree to iterating a list. You get roughly the
> same set of problems that you have to consider as today, and getting it
> right should be about the same difficulty.

Not quite. The essential idea is redo the drain begin/end in a correct way:

bdrv_drained_begin(bs):
    1.a) stop all sources of new requests in the connected components, including
        * aio_disable_external()
        * stop QED timer
        * stop block job
    1.b) aio_poll() until:
        * no request is in flight for bs
        * no progress is made (any progress may generate new requests)

bdrv_drained_end(bs):
    2.a) resume stopped sources of new requests
    2.b) no need to do aio_poll() because the main loop will move on

1.a) can be either done with either a graph recursion like now, or a list
traversing if managed somewhere, like in this series. Or better, BlockGraph will
be the manager. The rule is that these operations will not cause graph change,
so it's an improvement.

1.b) doesn't need recursion IMO, only looking at bs->in_flight and the return
value of aio_poll() should be enough.

For 2.a) if the drain begin/end callbacks are mananged in a list, it's easy to
only call drained_end() for entries that got drained_begin() earlier, as shown
in patch 2.

> 
> > > And finally, I don't really think that the recursion is even a problem.
> > > The problem is with graph changes made by callbacks that drain allows to
> > > run. With your changes, it might be a bit easier to avoid that
> > > bdrv_drain() itself gets into trouble due to graph changes, but this
> > > doesn't solve the problem for any (possibly indirect) callers of
> > > bdrv_drain().
> > 
> > The recursion is the one big place that can be easily broken by graph 
> > changes,
> > fixing this doesn't make the situation any worse. We could still fix the
> > indirect callers by taking references or by introducing "ubiquitous 
> > coroutines".
> 
> But hiding a bug in 80% of the cases where it shows isn't enough.

I think they are separate bugs. And with the one that this series is fixing,
others (bdrv_drain*() callers') may even not show up, because
bdrv_drained_begin() crashed already.

> 
> I think the only real solution is to forbid graph changes until after
> any critical operation has completed. I haven't tried it out in
> practice, but I suppose we could use a CoMutex around them and take it
> in bdrv_drained_begin/end() and all other places that can get into
> trouble with graph changes.

Yes, I agree, but that (using CoMutex around graph change) requires everything,
especially the defer_to_main_loop_bh, runs in a coroutine context, which is
exactly what I mean by "introducing 'ubiquitous coroutines'", because currently
we don't have them.

Fam



reply via email to

[Prev in Thread] Current Thread [Next in Thread]