From: Kevin Wolf
Subject: Re: [PATCH v4 3/4] qmp: Move dispatcher to a coroutine
Date: Wed, 19 Feb 2020 11:22:26 +0100
User-agent: Mutt/1.12.1 (2019-06-15)

Am 19.02.2020 um 10:03 hat Markus Armbruster geschrieben:
> >> >>   @@ -246,8 +287,15 @@ void monitor_qmp_bh_dispatcher(void *data)
> >> >>            }
> >> >>            qmp_request_free(req_obj);
> >> >> 
> >> >>   -    /* Reschedule instead of looping so the main loop stays responsive */
> >> >>   -    qemu_bh_schedule(qmp_dispatcher_bh);
> >> >>   +        /*
> >> >>   +         * Yield and reschedule so the main loop stays responsive.
> >> >>   +         *
> >> >>   +         * Move back to iohandler_ctx so that nested event loops for
> >> >>   +         * qemu_aio_context don't start new monitor commands.
> >> >> 
> >> >> Can you explain this sentence for dummies?
> >> >
> >> > Nested event loops (i.e. AIO_WAIT_WHILE) poll qemu_aio_context, so if we
> >> > are scheduled there, the next iteration of the monitor dispatcher loop
> >> > could start from a nested event loop. If we are scheduled in
> >> > iohandler_ctx, then only the actual main loop will reenter the coroutine
> >> > and nested event loops ignore it.
> >> >
> >> > I'm not sure if or why this is actually important, but this matches
> >> > scheduling the dispatcher BH in iohandler_ctx in the code before this
> >> > patch.
> >> >
> >> > If we didn't do this, we could end up starting monitor requests in more
> >> > places than before, and who knows what that would mean.
> >> 
> >> Let me say it in my own words, to make sure I got it.  I'm going to
> >> ignore special cases like "not using I/O thread" and exec-oob.
> >> 
> >> QMP monitor I/O happens in mon_iothread, in iohandler_ctx (I think).
> >> This pushes requests onto the monitor's qmp_requests queue.
> >
> > mon_iothread (like all iothreads) has a separate AioContext, which
> > doesn't have a name, but can be accessed with
> > iothread_get_aio_context(mon_iothread).
> 
> Got it.
> 
> @iohandler_ctx and @qemu_aio_context both belong to the main loop.
> 
> @qemu_aio_context is for general "run in the main loop" use.  It is
> polled both in the actual main loop and event loops nested in it.  I
> figure "both" to keep things responsive.
> 
> @iohandler_ctx is for "I/O handlers" (whatever those are).  It is polled
> only in the actual main loop.  I figure this means "I/O handlers" may
> get delayed while a nested event loop runs.  But better starve a bit
> than run in unexpected places.
> 
> >> Before this patch, the dispatcher runs in a bottom half in the main
> >> thread, in qemu_aio_context.
> >
> > The dispatcher BH is what is scheduled in iohandler_ctx. This means that
> > the BH is run from the main loop, but not from nested event loops.
> 
> Got it.
> 
> >> The patch moves it to a coroutine running in the main thread.  It runs
> >> in iohandler_ctx, but switches to qemu_aio_context for executing command
> >> handlers.
> >> 
> >> We want to keep command handlers running in qemu_aio_context, as before
> >> this patch.
> >
> > "Running in AioContext X" is kind of a sloppy term, and I'm afraid it
> > might be more confusing than helpful here. What this means is that the
> > code is run while holding the lock of the AioContext, and it registers
> > its BHs, callbacks etc. in that AioContext so it will be called from the
> > event loop of the respective thread.
> >
> > Before this patch, command handlers can't really use callbacks without a
> > nested event loop because they are synchronous.
> 
> I figure an example for such a nested event loop is BDRV_POLL_WHILE() in
> bdrv_truncate(), which runs in the command handler for block_resize.
> True?

Yes. I think most nested event loops are in the block layer, and they
use the BDRV_POLL_WHILE() or AIO_WAIT_WHILE() macros today.

> >                                                 The BQL is held, which
> > is equivalent to holding the qemu_aio_context lock.
> 
> Why is it equivalent?  Are they always taken together?

I guess ideally they would be. In practice, we neglect the
qemu_aio_context lock and rely on the BQL for everything in the main
thread. So maybe I should say the BQL replaces the AioContext lock for
the main context rather than being equivalent.

> >                                                     But other than that,
> > all of the definition of "running in an AioContext" doesn't really apply.
> >
> > Now with coroutines, you assign them an AioContext, which is the context
> > in which BHs reentering the coroutine will be scheduled, e.g. from
> > explicit aio_co_schedule(), after qemu_co_sleep_ns() or after waiting
> > for a lock like a CoMutex.
> >
> > qemu_aio_context is what most (if not all) of the existing QMP handlers
> > use when they run a nested event loop,
> 
> bdrv_truncate() appears to use bdrv_get_aio_context(), which is the
> BlockDriverState's AioContext if set, else @qemu_aio_context.

Right, the BDRV_POLL_WHILE() macro allows waiting for something in a
different AioContext from the main context. This works because of the
aio_wait_kick() in bdrv_truncate_co_entry(), which schedules a BH in the
main context, and this is what BDRV_POLL_WHILE() is waiting for.

So the command handler runs in the main context (and thread), whereas
the bdrv_co_truncate() runs in the AioContext (and thread) of the BDS.

> >                                        so running the coroutine in this
> > context while calling the handlers will make the transition the easiest.
> 
> In the case of block_resize: running it in a coroutine makes
> bdrv_truncate() use that coroutine instead of creating one together with
> a nested event loop.  "That coroutine" uses @qemu_aio_context.  So does
> the nested event loop, *except* when the BlockDriverState has an
> AioContext.
> 
> I have no idea when a BlockDriverState has an AioContext.  Can you help?

When it's used with a dataplane device, i.e. is tied to a separate
iothread.

The fact that the coroutine fastpath (not only in bdrv_truncate(), but
also everywhere else) assumes that we're already in the right AioContext
might be a bug.

If this is the only problem in v5, please consider applying only patches
1-3 (with the infrastructure) so that Marc-André's screendump patch gets
its dependency resolved, and I'll send a separate series for converting
block_resize that fixes this problem (probably by explicitly
rescheduling the coroutine into the BDS context and back in the
coroutine path).

> >> We want to run the rest in iohandler_ctx to ensure dispatching happens
> >> only in the main loop, as before this patch.
> >> 
> >> Correct?
> >
> > This part is correct.
> 
> Alright, let me try again, so you can point out the next level of my
> ignorance :)
> 
> QMP monitor I/O happens in I/O thread @mon_iothread, in the I/O thread's
> AioContext.  The I/O thread's event loop polls monitor I/O.  The read
> handler pushes requests onto the monitor's qmp_requests queue.
> 
> The dispatcher pops requests off these queues, and runs command
> handlers for them.
> 
> Before this patch, the dispatcher runs in a bottom half in the main
> thread with the BQL[*] held, in @iohandler_ctx context.  The latter
> ensures dispatch happens only from the main loop, not nested event
> loops.
> 
> The command handlers don't care for the AioContext; they're synchronous.
> If they need AIO internally, they have to set up a nested event loop.
> When they do, they pick the AioContext explicitly, so they still don't
> rely on the current context.

They do rely on being run in the main thread, but the exact context
doesn't matter. (In theory, running them in the thread of the block
node's AioContext would be fine, too, but the monitor can't know what
that is.)

See the AIO_WAIT_WHILE() documentation for details.

> The patch moves the dispatcher to a coroutine running in the main
> thread.  It runs in iohandler_ctx, but switches to qemu_aio_context for
> executing command handlers.
> 
> The dispatcher keeps running in @iohandler_ctx.  Good.
> 
> The command handlers now run in @qemu_aio_context.  Doesn't matter
> because "command handlers don't care for the AioContext".  Except when a
> command handler has a fast path like bdrv_truncate() has.  For those,
> @iohandler_ctx is generally not what we want.  @qemu_aio_context is.
> "Generally", since every one needs to be inspected for AioContext
> shenanigans.

Except for the bug that the specific code in bdrv_truncate() wants
to run the coroutine in the BDS AioContext, which may be different from
qemu_aio_context...

But I think the theory is right, for those commands that don't
potentially schedule a coroutine in a different AioContext.

Kevin



