qemu-devel

Re: [PATCH 00/23] block: Lock the graph, part 2 (BlockDriver callbacks)


From: Stefan Hajnoczi
Subject: Re: [PATCH 00/23] block: Lock the graph, part 2 (BlockDriver callbacks)
Date: Thu, 23 Feb 2023 15:33:51 -0500

On Thu, Feb 23, 2023 at 12:48:18PM +0100, Kevin Wolf wrote:
> On 21.02.2023 at 23:13, Stefan Hajnoczi wrote:
> > On Fri, Feb 03, 2023 at 04:21:39PM +0100, Kevin Wolf wrote:
> > > After introducing the graph lock in a previous series, this series
> > > actually starts making widespread use of it.
> > > 
> > > Most of the BlockDriver callbacks access the children list in some way,
> > > so you need to hold the graph lock to call them. The patches in this
> > > series add the corresponding GRAPH_RDLOCK annotations and take the lock
> > > in places where it doesn't happen yet - all of the bdrv_*() co_wrappers
> > > are already covered, but in particular BlockBackend coroutine_fns still
> > > need it.
> > > 
> > > There is no particularly good reason why exactly these patches and not
> > > others are included in the series. I couldn't find a self-contained part
> > > that could reasonably be addressed in a single series. So these just
> > > happen to be patches that are somewhat related (centered around the
> > > BlockDriver callback theme), are ready, and their number looks
> > > manageable. You will still see some FIXMEs at the end of the series
> > > that will only be addressed in future patches.
> > 
> > Two things occurred to me:
> > 
> > 1. The graph lock is becoming the new AioContext lock in the sense that
> > code using the block layer APIs needs to carefully acquire and release
> > the lock around operations. Why is it necessary to explicitly take the
> > rdlock in mirror_iteration()?
> > 
> >   + WITH_GRAPH_RDLOCK_GUARD() {
> >         ret = bdrv_block_status_above(source, NULL, offset,
> > 
> > I guess because bdrv_*() APIs are unlocked? The equivalent blk_*() API
> > would have taken the graph lock internally. Do we want to continue using
> > bdrv APIs even though it spreads graph locking concerns into block jobs?
> 
> The thing that makes it a bit ugly is that block jobs mix bdrv_*() and
> blk_*() calls. If they only used blk_*() we wouldn't have to take care
> of locking (but that means that the job code itself must not have a
> problem with a changing graph!). If they only used bdrv_*(), the
> function could just take a lock at the start and only temporarily
> release it around pause points. Both ways would look nicer than what we
> have now.
> 
> > 2. This series touches block drivers like qcow2. Luckily block drivers
> > just need to annotate their BlockDriver functions to indicate they run
> > under the rdlock, a lock that the block driver itself doesn't mess with.
> > It makes me wonder whether there is any point in annotating the
> > BlockDriver function pointers? It would be simpler if the block drivers
> > were unaware of the graph lock.
> 
> If you're unaware of the graph lock, how do you tell if you can call
> certain block layer functions that require the lock?
> 
> Especially since different BlockDriver callbacks have different rules
> (some have a reader lock, some have a writer lock, and some may stay
> unlocked even in the future), it would seem really hard to keep track of
> this when you don't make it explicit.

I discussed this offline with Kevin some more today. While there might
be opportunities to hide the lock (thereby making it easier), it's not
easy to do because we don't want to give up TSA static checking. Let's
put the graph lock in place first and worry about that later.

Acked-by: Stefan Hajnoczi <stefanha@redhat.com>


