[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 1/6] block: don't acquire AioContext lock in bdrv_drain_all()
From: |
Stefan Hajnoczi |
Subject: |
Re: [PATCH 1/6] block: don't acquire AioContext lock in bdrv_drain_all() |
Date: |
Thu, 9 Mar 2023 07:38:45 -0500 |
On Wed, Mar 08, 2023 at 06:25:43PM +0100, Kevin Wolf wrote:
> Am 08.03.2023 um 15:26 hat Stefan Hajnoczi geschrieben:
> > On Wed, Mar 08, 2023 at 09:48:17AM +0100, Kevin Wolf wrote:
> > > Am 07.03.2023 um 20:20 hat Stefan Hajnoczi geschrieben:
> > > > On Tue, Mar 07, 2023 at 06:17:22PM +0100, Kevin Wolf wrote:
> > > > > Am 01.03.2023 um 21:57 hat Stefan Hajnoczi geschrieben:
> > > > > > There is no need for the AioContext lock in bdrv_drain_all() because
> > > > > > nothing in AIO_WAIT_WHILE() needs the lock and the condition is
> > > > > > atomic.
> > > > > >
> > > > > > Note that the NULL AioContext argument to AIO_WAIT_WHILE() is odd.
> > > > > > In
> > > > > > the future it can be removed.
> > > > >
> > > > > It can be removed for all callers that run in the main loop context.
> > > > > For
> > > > > code running in an iothread, it's still important to pass a non-NULL
> > > > > context. This makes me doubt that the ctx parameter can really be
> > > > > removed without changing more.
> > > > >
> > > > > Is your plan to remove the if from AIO_WAIT_WHILE_INTERNAL(), too, and
> > > > > to poll qemu_get_current_aio_context() instead of ctx_ or the main
> > > > > context?
> > > >
> > > > This is what I'd like once everything has been converted to
> > > > AIO_WAIT_WHILE_UNLOCKED() - and at this point we might as well call it
> > > > AIO_WAIT_WHILE() again:
> > > >
> > > > #define AIO_WAIT_WHILE(cond) ({ \
> > > > bool waited_ = false; \
> > > > AioWait *wait_ = &global_aio_wait; \
> > > > /* Increment wait_->num_waiters before evaluating cond. */ \
> > > > qatomic_inc(&wait_->num_waiters); \
> > > > /* Paired with smp_mb in aio_wait_kick(). */ \
> > > > smp_mb(); \
> > > > while ((cond)) { \
> > > > aio_poll(qemu_get_current_aio_context(), true); \
> > > > waited_ = true; \
> > > > } \
> > > > qatomic_dec(&wait_->num_waiters); \
> > > > waited_; })
> > >
> > > Ok, yes, this is what I tried to describe above.
> > >
> > > > However, I just realized this only works in the main loop thread because
> > > > that's where aio_wait_kick() notifications are received. An IOThread
> > > > running AIO_WAIT_WHILE() won't be woken when another thread (including
> > > > the main loop thread) calls aio_wait_kick().
> > >
> > > Which is of course a limitation we already have today. You can wait for
> > > things in your own iothread, or for all threads from the main loop.
> > >
> > > However, in the future multiqueue world, the first case probably becomes
> > > pretty much useless because even for the same node, you could get
> > > activity in any thread.
> > >
> > > So essentially AIO_WAIT_WHILE() becomes GLOBAL_STATE_CODE(). Which is
> > > probably a good idea anyway, but I'm not entirely sure how many places
> > > we currently have where it's called from an iothread. I know the drain
> > > in mirror_run(), but Emanuele already had a patch in his queue where
> > > bdrv_co_yield_to_drain() schedules drain in the main context, so if that
> > > works, mirror_run() would be solved.
> > >
> > > https://gitlab.com/eesposit/qemu/-/commit/63562351aca4fb05d5711eb410feb96e64b5d4ad
> > >
> > > > I would propose introducing a QemuCond for each condition that we wait
> > > > on, but QemuCond lacks event loop integration. The current thread would
> > > > be unable to run aio_poll() while also waiting on a QemuCond.
> > > >
> > > > Life outside coroutines is hard, man! I need to think about this more.
> > > > Luckily this problem doesn't block this patch series.
> > >
> > > I hope that we don't really need all of this if we can limit running
> > > synchronous code to the main loop.
> >
> > Great idea, I think you're right.
> >
> > I'll audit the code to find the IOThread AIO_WAIT_WHILE() callers and
> > maybe a future patch series can work on that.
> >
> > > > > > There is an assertion in
> > > > > > AIO_WAIT_WHILE() that checks that we're in the main loop AioContext
> > > > > > and
> > > > > > we would lose that check by dropping the argument. However, that
> > > > > > was a
> > > > > > precursor to the GLOBAL_STATE_CODE()/IO_CODE() macros and is now a
> > > > > > duplicate check. So I think we won't lose much by dropping it, but
> > > > > > let's
> > > > > > do a few more AIO_WAIT_WHILE_UNLOCKED() coversions of this sort to
> > > > > > confirm this is the case.
> > > > > >
> > > > > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > > >
> > > > > Yes, it seems that we don't lose much, except maybe some consistency
> > > > > in
> > > > > the intermediate state. The commit message could state a bit more
> > > > > directly what we gain, though. Since you mention removing the
> > > > > parameter
> > > > > as a future possibility, I assume that's the goal with it, but I
> > > > > wouldn't be sure just from reading the commit message.
> > > >
> > > > AIO_WAIT_WHILE() callers need to be weened of the AioContext lock.
> > > > That's the main motivation and this patch series converts the easy
> > > > cases where we already don't need the lock. Dropping the function
> > > > argument eventually is a side benefit.
> > >
> > > Yes, but the conversion to AIO_WAIT_WHILE_UNLOCKED() could be done with
> > > ctx instead of NULL. So moving to NULL is a separate change that needs a
> > > separate explanation. You could even argue that it should be a separate
> > > patch if it's an independent change.
> > >
> > > Or am I missing something and keeping ctx would actually break things?
> >
> > Yes, ctx argument does not need to be modified when converting from
> > AIO_WAIT_WHILE() to AIO_WAIT_WHILE_UNLOCKED(). Passing it bothers me
> > because we don't really use it when unlock=false.
> >
> > Would you like me to keep ctx non-NULL for now?
>
> I don't really mind doing both changes in one commit because they are so
> small, but at least I'd like the commit message to be more explicit
> about the eventual goal we have with switching to NULL instead of just
> stating that it's odd, but harmless.
Got it, I'll send another revision.
Stefan
signature.asc
Description: PGP signature
- Re: [PATCH 2/6] block: convert blk_exp_close_all_type() to AIO_WAIT_WHILE_UNLOCKED(), (continued)
Re: [PATCH 2/6] block: convert blk_exp_close_all_type() to AIO_WAIT_WHILE_UNLOCKED(), Philippe Mathieu-Daudé, 2023/03/07
[PATCH 1/6] block: don't acquire AioContext lock in bdrv_drain_all(), Stefan Hajnoczi, 2023/03/01
[PATCH 3/6] block: convert bdrv_graph_wrlock() to AIO_WAIT_WHILE_UNLOCKED(), Stefan Hajnoczi, 2023/03/01
[PATCH 4/6] block: convert bdrv_drain_all_begin() to AIO_WAIT_WHILE_UNLOCKED(), Stefan Hajnoczi, 2023/03/01
[PATCH 5/6] hmp: convert handle_hmp_command() to AIO_WAIT_WHILE_UNLOCKED(), Stefan Hajnoczi, 2023/03/01