From: Kevin Wolf
Subject: Re: [Qemu-devel] [PATCH v2 11/17] block-backend: Decrease in_flight only after callback
Date: Fri, 14 Sep 2018 19:14:33 +0200
User-agent: Mutt/1.9.1 (2017-09-22)

On 14.09.2018 at 17:12, Paolo Bonzini wrote:
> On 13/09/2018 18:59, Kevin Wolf wrote:
> > On 13.09.2018 at 17:10, Paolo Bonzini wrote:
> >> On 13/09/2018 14:52, Kevin Wolf wrote:
> >>> +    if (qemu_get_current_aio_context() == qemu_get_aio_context()) {
> >>> +        /* If we are in the main thread, the callback is allowed to unref
> >>> +         * the BlockBackend, so we have to hold an additional reference */
> >>> +        blk_ref(acb->rwco.blk);
> >>> +    }
> >>>      acb->common.cb(acb->common.opaque, acb->rwco.ret);
> >>> +    blk_dec_in_flight(acb->rwco.blk);
> >>> +    if (qemu_get_current_aio_context() == qemu_get_aio_context()) {
> >>> +        blk_unref(acb->rwco.blk);
> >>> +    }
> >>
> >> Is this something that happens only for some specific callers?  That is,
> >> which callers are sure that the callback is invoked from the main thread?
> > 
> > I can't seem to reproduce the problem I saw any more even when reverting
> > the bdrv_ref/unref pair. If I remember correctly it was actually a
> > nested aio_poll() that was running a block job completion or something
> > like that - which would obviously only happen on the main thread because
> > the job intentionally defers to the main thread.
> > 
> > The only reason I made this conditional is that I think bdrv_unref()
> > still isn't safe outside the main thread, is it?
> 
> Yes, making it conditional is correct, but it is quite fishy even with
> the conditional.
> 
> As you mention, you could have a nested aio_poll() in the main thread,
> for example invoked from a bottom half, but in that case I'd rather
> track the caller that is creating the bottom half and see if it lacks a
> bdrv_ref/bdrv_unref (or perhaps it's even higher in the tree that is
> missing).

I went back to the commit where I first added the patch (it already
contained the ref/unref pair) and checked whether I could reproduce a
bug with the pair removed. I couldn't.

I'm starting to think that maybe I was just overly cautious with the
ref/unref. I may have confused the nested aio_poll() crash with a
different situation. I've dealt with so many crashes and hangs while
working on this series that it's quite possible.
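
For reference, the pattern under discussion can be sketched in isolation.
This is only an illustration with made-up names (Backend, backend_ref(),
backend_unref(), complete_request(), run_demo()), not the actual
block-backend code, and it leaves out the main-thread condition entirely.
It shows why an extra reference would be needed if the callback could drop
the last one before in_flight is decremented:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted backend; NOT QEMU's BlockBackend. */
typedef struct Backend {
    int refcnt;
    int in_flight;
} Backend;

static int backends_freed;   /* counts frees so the lifetime can be checked */

static void backend_ref(Backend *b)
{
    b->refcnt++;
}

static void backend_unref(Backend *b)
{
    assert(b->refcnt > 0);
    if (--b->refcnt == 0) {
        backends_freed++;
        free(b);
    }
}

typedef void CompletionFunc(void *opaque, int ret);

/* A completion callback that drops the caller's (last) reference. */
static void drop_ref_cb(void *opaque, int ret)
{
    (void)ret;
    backend_unref(opaque);
}

/* Run the callback first, then decrease in_flight. The extra reference
 * keeps b valid between the two steps even if the callback unrefs it.
 * Returns the in_flight count after the decrement. */
static int complete_request(Backend *b, CompletionFunc *cb, void *opaque,
                            int ret)
{
    int remaining;

    backend_ref(b);               /* pin b across the callback */
    cb(opaque, ret);              /* may drop the caller's reference */
    remaining = --b->in_flight;   /* safe: our reference still holds b */
    backend_unref(b);             /* may free b; don't touch it afterwards */
    return remaining;
}

/* Hypothetical driver: one backend, one request in flight. */
static int run_demo(void)
{
    Backend *b = calloc(1, sizeof(*b));

    if (!b) {
        return -1;
    }
    b->refcnt = 1;
    b->in_flight = 1;
    return complete_request(b, drop_ref_cb, b, 0);
}
```

Without the backend_ref()/backend_unref() pair in complete_request(), the
in_flight decrement in this sketch would touch freed memory. Whether any
real caller ever gets into that situation is exactly what I couldn't
reproduce.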

Kevin


