From: Kevin Wolf
Subject: Re: [PATCH v5 28/31] block.c: assert BQL lock held in bdrv_co_invalidate_cache
Date: Wed, 19 Jan 2022 19:34:42 +0100

On 20.12.2021 13:20, Emanuele Giuseppe Esposito wrote:
> 
> 
> On 17/12/2021 17:38, Emanuele Giuseppe Esposito wrote:
> > 
> > 
> > On 17/12/2021 12:04, Hanna Reitz wrote:
> > > On 24.11.21 07:44, Emanuele Giuseppe Esposito wrote:
> > > > bdrv_co_invalidate_cache is special: it is an I/O function,
> > > 
> > > I still don’t believe it is, but well.
> > > 
> > > (Yes, it is called by a test in an iothread, but I believe we’ve
> > > seen that the tests simply sometimes test things that shouldn’t be
> > > allowed.)
> > > 
> > > > but uses the block layer permission API, which is GS.
> > > > 
> > > > Because of this, we can assert that either the function is
> > > > being called with BQL held, and thus can use the permission API,
> > > > or make sure that the permission API is not used, by ensuring that
> > > > bs (and parents) .open_flags does not contain BDRV_O_INACTIVE.
> > > > 
> > > > Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> > > > ---
> > > >   block.c | 26 ++++++++++++++++++++++++++
> > > >   1 file changed, 26 insertions(+)
> > > > 
> > > > diff --git a/block.c b/block.c
> > > > index a0309f827d..805974676b 100644
> > > > --- a/block.c
> > > > +++ b/block.c
> > > > @@ -6574,6 +6574,26 @@ void bdrv_init_with_whitelist(void)
> > > >       bdrv_init();
> > > >   }
> > > > +static bool bdrv_is_active(BlockDriverState *bs)
> > > > +{
> > > > +    BdrvChild *parent;
> > > > +
> > > > +    if (bs->open_flags & BDRV_O_INACTIVE) {
> > > > +        return false;
> > > > +    }
> > > > +
> > > > +    QLIST_FOREACH(parent, &bs->parents, next_parent) {
> > > > +        if (parent->klass->parent_is_bds) {
> > > > +            BlockDriverState *parent_bs = parent->opaque;
> > > 
> > > This looks like a really bad hack to me.  We purposefully have made
> > > the parent link opaque so that a BDS cannot easily reach its
> > > parents.  All accesses should go through BdrvChildClass methods.
> > > 
> > > I also don’t understand why we need to query parents at all.  The
> > > only fact that determines whether the current BDS will have its
> > > permissions changed is whether the BDS itself is active or
> > > inactive.  Sure, we’ll invoke bdrv_co_invalidate_cache() on the
> > > parents, too, but then we could simply let the assertion fail there.
> > > 
> > > > +            if (!bdrv_is_active(parent_bs)) {
> > > > +                return false;
> > > > +            }
> > > > +        }
> > > > +    }
> > > > +
> > > > +   return true;
> > > > +}
> > > > +
> > > >   int coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs, Error **errp)
> > > >   {
> > > >       BdrvChild *child, *parent;
> > > > @@ -6585,6 +6605,12 @@ int coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs, Error **errp)
> > > >           return -ENOMEDIUM;
> > > >       }
> > > > +    /*
> > > > +     * No need to muck with permissions if bs is active.
> > > > +     * TODO: should activation be a separate function?
> > > > +     */
> > > > +    assert(qemu_in_main_thread() || bdrv_is_active(bs));
> > > > +
> > > 
> > > I don’t understand this, really.  It looks to me like “if you don’t
> > > call this in the main thread, this better be a no-op”, i.e., you
> > > must never call this function in an I/O thread if you really want to
> > > use it.  I.e. what I’d classify as a GS function.
> > > 
> > > It sounds like this is just a special case for said test, and
> > > special-casing code for tests sounds like a bad idea.
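
(Just to make the comparison concrete: dropping the parent recursion Hanna questions above would reduce the check to something like the minimal sketch below. This is illustrative only, an assumption about what that simplification would look like, not code from the patch or from this thread.)

    /* Only the node's own flag decides whether its permissions will be
     * touched, so no recursion over bs->parents is needed. */
    static bool bdrv_is_active(BlockDriverState *bs)
    {
        return !(bs->open_flags & BDRV_O_INACTIVE);
    }

    /* ... and in bdrv_co_invalidate_cache(): */
    assert(qemu_in_main_thread() || bdrv_is_active(bs));
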
> > 
> > Ok, but trying to leave just the qemu_in_main_thread() assertion makes
> > test 307 (./check 307) fail.
> > I am actually not sure why it fails, but I am sure it is because of the
> > assertion, since without it the test passes.
> > 
> > I tried with gdb (./check -gdb 307 in one terminal and
> > gdb -iex "target remote localhost:12345"
> > in another), but it points me to the backtrace below, which I think is
> > the nbd server getting its socket closed (because the other side
> > crashed), and not the actual error.
> > 
> > 
> > Thread 1 "qemu-system-x86" received signal SIGPIPE, Broken pipe.
> > 0x00007ffff68af54d in sendmsg () from target:/lib64/libc.so.6
> > (gdb) bt
> > #0  0x00007ffff68af54d in sendmsg () from target:/lib64/libc.so.6
> > #1  0x0000555555c13cc9 in qio_channel_socket_writev (ioc=<optimized out>, iov=0x5555569a4870, niov=1, fds=0x0, nfds=<optimized out>, errp=0x0) at ../io/channel-socket.c:561
> > #2  0x0000555555c19b18 in qio_channel_writev_full_all (ioc=0x55555763b800, iov=iov@entry=0x7fffe8dffd80, niov=niov@entry=1, fds=fds@entry=0x0, nfds=nfds@entry=0, errp=errp@entry=0x0) at ../io/channel.c:240
> > #3  0x0000555555c19bd2 in qio_channel_writev_all (errp=0x0, niov=1, iov=0x7fffe8dffd80, ioc=<optimized out>) at ../io/channel.c:220
> > #4  qio_channel_write_all (ioc=<optimized out>, buf=buf@entry=0x7fffe8dffdd0 "", buflen=buflen@entry=20, errp=errp@entry=0x0) at ../io/channel.c:330
> > #5  0x0000555555c27e75 in nbd_write (errp=0x0, size=20, buffer=0x7fffe8dffdd0, ioc=<optimized out>) at ../nbd/nbd-internal.h:71
> > #6  nbd_negotiate_send_rep_len (client=client@entry=0x555556f60930, type=type@entry=1, len=len@entry=0, errp=errp@entry=0x0) at ../nbd/server.c:203
> > #7  0x0000555555c29db1 in nbd_negotiate_send_rep (errp=0x0, type=1, client=0x555556f60930) at ../nbd/server.c:211
> > #8  nbd_negotiate_options (errp=0x7fffe8dffe88, client=<optimized out>) at ../nbd/server.c:1224
> > #9  nbd_negotiate (errp=0x7fffe8dffe88, client=<optimized out>) at ../nbd/server.c:1340
> > #10 nbd_co_client_start (opaque=<optimized out>) at ../nbd/server.c:2715
> > #11 0x0000555555d70423 in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at ../util/coroutine-ucontext.c:173
> > #12 0x00007ffff67f3820 in ?? () from target:/lib64/libc.so.6
> > #13 0x00007fffffffca80 in ?? ()
> > 
> 
> Ok, the reason for this is line 89 of tests/qemu-iotests/307:
> 
> # Should ignore the iothread conflict, but then fail because of the
> # permission conflict (and not crash)
> vm.qmp_log('block-export-add', id='export1', type='nbd', node_name='null',
>                iothread='iothread1', fixed_iothread=False, writable=True)
> 
> This calls qmp_block_export_add() and then blk_exp_add(), which calls
> bdrv_invalidate_cache().
> Both functions run in the main thread, but bdrv_invalidate_cache() then
> invokes bdrv_co_invalidate_cache() as a coroutine in the AioContext of
> the given bs, so not in the main loop.
> 
> So Hanna, what should we do here? This seems very similar to the
> discussion in patch 22, i.e. running blockdev-create (in this case
> block-export-add, which seems similar?) on nodes in I/O threads.

My gut feeling is that the bug is that we're entering coroutine context
in the node's iothread too early. I think the only place where we really
need it is the bs->drv->bdrv_co_invalidate_cache() call.
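
For context, the path Emanuele describes looks roughly like this today (a simplified sketch with approximate, hypothetical names, not the exact code generated by block-coroutine-wrapper.py): the whole of bdrv_co_invalidate_cache(), permission handling included, is entered as a coroutine in bs's AioContext.

#include "qemu/osdep.h"
#include "block/block_int.h"
#include "block/aio-wait.h"

typedef struct InvalidateCacheCo {
    BlockDriverState *bs;
    Error **errp;
    bool done;
    int ret;
} InvalidateCacheCo;

static void coroutine_fn invalidate_cache_co_entry(void *opaque)
{
    InvalidateCacheCo *s = opaque;

    s->ret = bdrv_co_invalidate_cache(s->bs, s->errp);
    s->done = true;
    aio_wait_kick();
}

int bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
{
    InvalidateCacheCo s = { .bs = bs, .errp = errp, .done = false };

    if (qemu_in_coroutine()) {
        /* Fast path: we are already in coroutine context */
        invalidate_cache_co_entry(&s);
    } else {
        Coroutine *co = qemu_coroutine_create(invalidate_cache_co_entry, &s);

        /* Execution moves to bs's AioContext here, i.e. into the iothread */
        bdrv_coroutine_enter(bs, co);
        BDRV_POLL_WHILE(bs, !s.done);
    }
    return s.ret;
}
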

In fact, not only permission code, but also parent->klass->activate()
must run in the main thread. The only implementation of the callback is
blk_root_activate(), and it calls some permission functions, too, but
also qemu_add_vm_change_state_handler(), which doesn't look thread-safe.
This is unlikely to ever cause problems, and if it does, it won't be
reproducible, but it is still a bug.

So if we go back to a bdrv_invalidate_cache() that does all the graph
manipulations (and asserts that we're in the main loop) and then have a
much smaller bdrv_co_invalidate_cache() that basically just calls into
the driver, would that solve the problem?
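
Something like this, roughly (a sketch only, to illustrate the split; helper names such as bdrv_refresh_perms() and the error handling are approximate, not a definitive implementation):

/* GS code, runs under the BQL in the main loop */
int bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
{
    BdrvChild *child, *parent;
    int ret;

    assert(qemu_in_main_thread());

    if (!bs->drv) {
        return -ENOMEDIUM;
    }

    /* Graph manipulation: recurse into the children first */
    QLIST_FOREACH(child, &bs->children, next) {
        ret = bdrv_invalidate_cache(child->bs, errp);
        if (ret < 0) {
            return ret;
        }
    }

    if (bs->open_flags & BDRV_O_INACTIVE) {
        bs->open_flags &= ~BDRV_O_INACTIVE;

        /* Permission updates stay in the main thread */
        ret = bdrv_refresh_perms(bs, errp);
        if (ret < 0) {
            bs->open_flags |= BDRV_O_INACTIVE;
            return ret;
        }

        /* ...and so does parent->klass->activate() (blk_root_activate()) */
        QLIST_FOREACH(parent, &bs->parents, next_parent) {
            if (parent->klass->activate) {
                parent->klass->activate(parent, errp);
            }
        }
    }

    /* In reality this would go through a small generated co_wrapper that
     * enters the coroutine in bs's AioContext; only the driver callback
     * still needs to run there. */
    return bdrv_co_invalidate_cache(bs, errp);
}

/* I/O code, entered as a coroutine in bs's AioContext */
int coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs, Error **errp)
{
    Error *local_err = NULL;

    if (bs->drv && bs->drv->bdrv_co_invalidate_cache) {
        bs->drv->bdrv_co_invalidate_cache(bs, &local_err);
        if (local_err) {
            error_propagate(errp, local_err);
            return -EINVAL;
        }
    }
    return 0;
}
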

Kevin



