[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands
From: |
Kevin Wolf |
Subject: |
Re: [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands |
Date: |
Fri, 6 Mar 2020 15:26:40 +0100 |
User-agent: |
Mutt/1.12.1 (2019-06-15) |
Am 06.03.2020 um 13:38 hat Markus Armbruster geschrieben:
> Kevin Wolf <address@hidden> writes:
>
> > Am 06.03.2020 um 08:25 hat Markus Armbruster geschrieben:
> >> Kevin Wolf <address@hidden> writes:
> >>
> >> > Am 05.03.2020 um 16:30 hat Markus Armbruster geschrieben:
> >> >> Kevin Wolf <address@hidden> writes:
> >> >>
> >> >> > Am 22.01.2020 um 07:32 hat Markus Armbruster geschrieben:
> >> >> >> Kevin Wolf <address@hidden> writes:
> >> >> >>
> >> >> >> > This patch adds a new 'coroutine' flag to QMP command definitions
> >> >> >> > that
> >> >> >> > tells the QMP dispatcher that the command handler is safe to be
> >> >> >> > run in a
> >> >> >> > coroutine.
> >> >> >>
> >> >> >> I'm afraid I missed this question in my review of v3: when is a
> >> >> >> handler
> >> >> >> *not* safe to be run in a coroutine?
> >> >> >
> >> >> > That's a hard one to answer fully.
> >> >> >
> >> >> > Basically, I think the biggest problem is with calling functions that
> >> >> > change their behaviour if run in a coroutine compared to running them
> >> >> > outside of coroutine context. In most cases the differences like
> >> >> > having
> >> >> > a nested event loop instead of yielding are just fine, but they are
> >> >> > still subtly different.
> >> >> >
> >> >> > I know this is vague, but I can assure you that problematic cases
> >> >> > exist.
> >> >> > I hit one of them with my initial hack that just moved everything
> >> >> > into a
> >> >> > coroutine. It was related to graph modifications and bdrv_drain and
> >> >> > resulted in a hang. For the specifics, I would have to try and
> >> >> > reproduce
> >> >> > the problem again.
> >> >>
> >> >> I'm afraid it's even more complicated.
> >> >>
> >> >> Monitors (HMP and QMP) run in the main loop. Before this patch, HMP and
> >> >> QMP commands run start to finish, one after the other.
> >> >>
> >> >> After this patch, QMP commands may elect to yield. QMP commands still
> >> >> run one after the other (the shared dispatcher ensures this even when we
> >> >> have multiple QMP monitors).
> >> >>
> >> >> However, *HMP* commands can now run interleaved with a coroutine-enabled
> >> >> QMP command, i.e. between a yield and its re-enter.
> >> >
> >> > I guess that's right. :-(
> >> >
> >> >> Consider an HMP screendump running in the middle of a coroutine-enabled
> >> >> QMP screendump, using Marc-André's patch. As far as I can tell, it
> >> >> works, because:
> >> >>
> >> >> 1. HMP does not run in a coroutine. If it did, and both dumps dumped
> >> >> the same @con, then it would overwrite con->screndump_co. If we ever
> >> >> decide to make HMP coroutine-capable so it doesn't block the main loop,
> >> >> this will become unsafe in a nasty way.
> >> >
> >> > At the same time, switching HMP to coroutines would give us an easy way
> >> > to fix the problem: Just use a CoMutex so that HMP and QMP commands
> >> > never run in parallel. Unless we actually _want_ to run both at the same
> >> > time for some commands, but I don't see why.
> >>
> >> As long as running QMP commands from *all* monitors one after the other
> >> is good enough, I can't see why running HMP commands interleaved is
> >> worth the risk.
> >
> > There is one exception, actually: Obviously, human-monitor-command must
> > allow HMP commands to run in parallel.
>
> Yes.
>
> >> > While we don't have this, maybe it's worth considering if there is
> >> > another simple way to achieve the same thing. Could QMP just suspend all
> >> > HMP monitors while it's executing a command? At first sight it seems
> >> > that this would work only for "interactive" monitors.
> >>
> >> I believe the non-"interactive" HMP code is used only by gdbstub.c.
> >
> > monitor_init_hmp() is called from (based on my block branch):
> >
> > * monitor_init(): This is interactive.
> > * qemu_chr_new_noreplay(): Interactive, too.
> > * gdbserver_start(): Non-interactive.
> >
> > There is also qmp_human_monitor_command(), which manually creates a
> > MonitorHMP without going through monitor_init_hmp(). It does call
> > monitor_data_init(), though. There are no additional callers of
> > monitor_data_init() and I think I would have added it to every creation
> > of a Monitor object when I did the QMP/HMP split of the struct.
> >
> > So GDB and human-monitor-command should be the two non-interactive
> > cases.
>
> Yes.
>
> >> I keep forgetting our GDB server stub creates an "HMP" monitor.
> >> Possibly because I don't understand why. Alex, Philippe, can you
> >> enlighten me?
> >
> > I think you can send HMP commands from within gdb with it:
> >
> > (gdb) tar rem:1234
> > Remote debugging using :1234
> > [...]
> > (gdb) monitor info block
> > ide1-cd0: [not inserted]
> > Attached to: /machine/unattached/device[23]
> > Removable device: not locked, tray closed
> >
> > floppy0: [not inserted]
> > Attached to: /machine/unattached/device[16]
> > Removable device: not locked, tray closed
> >
> > sd0: [not inserted]
> > Removable device: not locked, tray closed
>
> Argh!
>
> Any idea where we define GDB command "monitor"?
Just following the s->mon_chr that is assigned in gdbserver_start(), it
looks like handle_query_rcmd() sends the HMP command to the monitor.
gdb_gen_query_table has a function pointer to handle_query_rcmd() for
the gdb protocol command "Rcmd", and I think this is what gdb will send
for the "monitor" command.
> > So we do want stop it from processing requests while a QMP command is
> > running.
>
> Then a slow QMP command can interfere with debugging.
>
> Perhaps we can synchronize just the "monitor" command.
I didn't mean stop processing of gdb commands, but only of HMP commands
submitted via gdb (which will probably still block gdb while it's
waiting for a response, but only if a "monitor" command was given).
This is probably still not trivial because so far we have no buffering
involved anywhere and handle_query_rcmd() (or probably the whole gdbstub
command processing) is synchronous. Maybe run a nested event loop until
the QMP command is done or something.
Kevin