qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/6] block: don't acquire AioContext lock in bdrv_drain_all()


From: Kevin Wolf
Subject: Re: [PATCH 1/6] block: don't acquire AioContext lock in bdrv_drain_all()
Date: Wed, 8 Mar 2023 09:48:17 +0100

Am 07.03.2023 um 20:20 hat Stefan Hajnoczi geschrieben:
> On Tue, Mar 07, 2023 at 06:17:22PM +0100, Kevin Wolf wrote:
> > Am 01.03.2023 um 21:57 hat Stefan Hajnoczi geschrieben:
> > > There is no need for the AioContext lock in bdrv_drain_all() because
> > > nothing in AIO_WAIT_WHILE() needs the lock and the condition is atomic.
> > > 
> > > Note that the NULL AioContext argument to AIO_WAIT_WHILE() is odd. In
> > > the future it can be removed.
> > 
> > It can be removed for all callers that run in the main loop context. For
> > code running in an iothread, it's still important to pass a non-NULL
> > context. This makes me doubt that the ctx parameter can really be
> > removed without changing more.
> > 
> > Is your plan to remove the if from AIO_WAIT_WHILE_INTERNAL(), too, and
> > to poll qemu_get_current_aio_context() instead of ctx_ or the main
> > context?
> 
> This is what I'd like once everything has been converted to
> AIO_WAIT_WHILE_UNLOCKED() - and at this point we might as well call it
> AIO_WAIT_WHILE() again:
> 
>   #define AIO_WAIT_WHILE(cond) ({                                    \
>       bool waited_ = false;                                          \
>       AioWait *wait_ = &global_aio_wait;                             \
>       /* Increment wait_->num_waiters before evaluating cond. */     \
>       qatomic_inc(&wait_->num_waiters);                              \
>       /* Paired with smp_mb in aio_wait_kick(). */                   \
>       smp_mb();                                                      \
>       while ((cond)) {                                               \
>           aio_poll(qemu_get_current_aio_context(), true);            \
>           waited_ = true;                                            \
>       }                                                              \
>       qatomic_dec(&wait_->num_waiters);                              \
>       waited_; })

Ok, yes, this is what I tried to describe above.

> However, I just realized this only works in the main loop thread because
> that's where aio_wait_kick() notifications are received. An IOThread
> running AIO_WAIT_WHILE() won't be woken when another thread (including
> the main loop thread) calls aio_wait_kick().

Which is of course a limitation we already have today. You can wait for
things in your own iothread, or for all threads from the main loop.

However, in the future multiqueue world, the first case probably becomes
pretty much useless because even for the same node, you could get
activity in any thread.

So essentially AIO_WAIT_WHILE() becomes GLOBAL_STATE_CODE(). Which is
probably a good idea anyway, but I'm not entirely sure how many places
we currently have where it's called from an iothread. I know the drain
in mirror_run(), but Emanuele already had a patch in his queue where
bdrv_co_yield_to_drain() schedules drain in the main context, so if that
works, mirror_run() would be solved.

https://gitlab.com/eesposit/qemu/-/commit/63562351aca4fb05d5711eb410feb96e64b5d4ad

> I would propose introducing a QemuCond for each condition that we wait
> on, but QemuCond lacks event loop integration. The current thread would
> be unable to run aio_poll() while also waiting on a QemuCond.
> 
> Life outside coroutines is hard, man! I need to think about this more.
> Luckily this problem doesn't block this patch series.

I hope that we don't really need all of this if we can limit running
synchronous code to the main loop.

> > > There is an assertion in
> > > AIO_WAIT_WHILE() that checks that we're in the main loop AioContext and
> > > we would lose that check by dropping the argument. However, that was a
> > > precursor to the GLOBAL_STATE_CODE()/IO_CODE() macros and is now a
> > > duplicate check. So I think we won't lose much by dropping it, but let's
> > > do a few more AIO_WAIT_WHILE_UNLOCKED() coversions of this sort to
> > > confirm this is the case.
> > > 
> > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > 
> > Yes, it seems that we don't lose much, except maybe some consistency in
> > the intermediate state. The commit message could state a bit more
> > directly what we gain, though. Since you mention removing the parameter
> > as a future possibility, I assume that's the goal with it, but I
> > wouldn't be sure just from reading the commit message.
> 
> AIO_WAIT_WHILE() callers need to be weened of the AioContext lock.
> That's the main motivation and this patch series converts the easy
> cases where we already don't need the lock. Dropping the function
> argument eventually is a side benefit.

Yes, but the conversion to AIO_WAIT_WHILE_UNLOCKED() could be done with
ctx instead of NULL. So moving to NULL is a separate change that needs a
separate explanation. You could even argue that it should be a separate
patch if it's an independent change.

Or am I missing something and keeping ctx would actually break things?

Kevin

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]