From: Daniel P. Berrange
Subject: Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread
Date: Wed, 30 Aug 2017 11:13:11 +0100
User-agent: Mutt/1.8.3 (2017-05-23)

On Wed, Aug 30, 2017 at 09:06:20AM +0200, Markus Armbruster wrote:
> "Daniel P. Berrange" <address@hidden> writes:
> 
> > On Wed, Aug 23, 2017 at 02:51:03PM +0800, Peter Xu wrote:
> 
> >> However, even with the series, it does not mean that per-monitor
> >> threads will never hang.  One example is that we can still run "info
> >> cpus" in a per-monitor thread during a paused postcopy (in that
> >> state, page faults are never handled, and "info cpus" will never
> >> return since it tries to sync all vcpus).  So to make sure it does
> >> not hang, we not only need the per-monitor thread, but the user also
> >> needs to be careful about how they use it.
> >> 
> >> For postcopy recovery, we may need dedicated monitor channel for
> >> recovery.  In other words, a destination VM that supports postcopy
> >> recovery would possibly need:
> >> 
> >>   -qmp MAIN_CHANNEL -qmp RECOVERY_CHANNEL
> 
> Where RECOVERY_CHANNEL isn't necessarily just for postcopy, but for any
> "emergency" QMP access.  If you use it only for commands that cannot
> hang (i.e. terminate in bounded time), then you'll always be able to get
> commands accepted there in bounded time.
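
For reference, such a dual-channel invocation might look like the
following (socket paths invented for illustration):

    qemu-system-x86_64 ... \
        -qmp unix:/run/qemu-main.sock,server,nowait \
        -qmp unix:/run/qemu-recovery.sock,server,nowait

The management app would then reserve the second socket strictly for
commands known to terminate in bounded time.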
> 
> > I think this is a really horrible thing to expose to management
> > applications. They should not need to be aware of the fact that QEMU
> > is buggy and thus requires that certain commands be run on different
> > monitors to work around the bug.
> 
> These are (serious) design limitations, not bugs in the narrow sense of
> the word.
> 
> However, I quite agree that the need for clients to know whether a
> monitor command can hang is impractical for the general case.  What
> might be practical is a QMP monitor mode that accepts only known
> hang-free commands.  Hang-free could be introspectable.
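
To picture what "introspectable" might mean here: query-qmp-schema
could grow a per-command flag, along these lines (the "hang-free"
member below is purely hypothetical):

    { "execute": "query-qmp-schema" }
    ...
    { "name": "query-status", "meta-type": "command", "hang-free": true }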
> 
> In case you consider that ugly: it's best to explore the design space
> first, and recoil from "ugly" second.

Actually you slightly misinterpreted me there. I think it is ok for
applications to have knowledge about whether a particular command
may hang or not. Given that knowledge, QEMU should *not*, however,
require that the application issue such commands on separate monitor
channels. It is entirely possible to handle hang-free commands on the
existing channel.

> > I'd much prefer to see the problem described handled transparently inside
> > QEMU. One approach is to have a dedicated thread responsible for all
> > monitor I/O. This thread should never actually execute monitor commands
> > though, it would simply parse the command request and put data onto a queue
> > of pending commands, thus it could never hang. The command queue could be
> > processed by the main thread, or by another thread that is interested.
> > eg the migration thread could process any queued commands related to
> > migration directly.
> 
> The monitor itself can't hang then, but the thread(s) dequeuing parsed
> commands can.

If certain commands are hang-free then you can have a dedicated thread
that only dequeues & processes the hang-free commands. The approach I
outlined is exactly how libvirt deals with its own RPC dispatch. We have
certain commands that are guaranteed to not hang, which are processed by
a dedicated pool of threads. So even if all normal RPC commands have
hung, you can still run a subset of hang-free RPC commands.
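
To make the shape concrete, here is a rough sketch of that dispatch
split (names invented, error handling omitted). The I/O thread only
parses and routes, while a dedicated worker drains a queue reserved
for hang-free commands, so it keeps responding even if the main
thread is wedged:

    #include <pthread.h>
    #include <stdbool.h>
    #include <stddef.h>

    typedef struct MonCommand {
        struct MonCommand *next;
        char *request;           /* the parsed command */
        bool hang_free;          /* known to terminate in bounded time */
    } MonCommand;

    typedef struct {
        MonCommand *head, *tail;
        pthread_mutex_t lock;
        pthread_cond_t cond;
    } CommandQueue;

    /* One queue drained by the main loop, one by a dedicated thread. */
    static CommandQueue main_queue = {
        .lock = PTHREAD_MUTEX_INITIALIZER, .cond = PTHREAD_COND_INITIALIZER,
    };
    static CommandQueue hang_free_queue = {
        .lock = PTHREAD_MUTEX_INITIALIZER, .cond = PTHREAD_COND_INITIALIZER,
    };

    static void execute_command(MonCommand *cmd)
    {
        /* dispatch to the QMP handler; stubbed out in this sketch */
        (void)cmd;
    }

    static void queue_push(CommandQueue *q, MonCommand *cmd)
    {
        cmd->next = NULL;
        pthread_mutex_lock(&q->lock);
        if (q->tail) {
            q->tail->next = cmd;
        } else {
            q->head = cmd;
        }
        q->tail = cmd;
        pthread_cond_signal(&q->cond);
        pthread_mutex_unlock(&q->lock);
    }

    /* Monitor I/O thread: parses, classifies, enqueues - it never
     * executes commands, so it can never hang. */
    static void monitor_io_dispatch(MonCommand *cmd)
    {
        queue_push(cmd->hang_free ? &hang_free_queue : &main_queue, cmd);
    }

    /* Dedicated worker: only ever sees hang-free commands. */
    static void *hang_free_worker(void *opaque)
    {
        for (;;) {
            pthread_mutex_lock(&hang_free_queue.lock);
            while (!hang_free_queue.head) {
                pthread_cond_wait(&hang_free_queue.cond,
                                  &hang_free_queue.lock);
            }
            MonCommand *cmd = hang_free_queue.head;
            hang_free_queue.head = cmd->next;
            if (!hang_free_queue.head) {
                hang_free_queue.tail = NULL;
            }
            pthread_mutex_unlock(&hang_free_queue.lock);
            execute_command(cmd);  /* bounded time by construction */
        }
        return NULL;
    }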

> 
> To maintain commands' synchronous semantics, their replies need to be
> sent in order, which of course reintroduces the hangs.

The requirement for such ordering is just an arbitrary restriction that
QEMU currently imposes. It is reasonable to allow arbitrary ordering of
responses (which is what libvirt does in its RPC layer). Admittedly at
this stage, we would likely require some "opt-in" handshake when
initializing QMP for the app to tell QEMU it can cope with out-of-order
replies. It would require that each command request carry a unique
serial number, which is included in the associated reply, so apps can
match them up. We used to have that, but IIRC it was then removed.
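
To illustrate the matching scheme: each request would carry a
client-chosen tag that is echoed in the reply, along these lines
(C = client, S = server; commands and values are just examples):

    C: { "execute": "query-status", "id": 1 }
    C: { "execute": "query-block",  "id": 2 }
    S: { "return": { ... }, "id": 2 }    <- may arrive out of order
    S: { "return": { "status": "running", ... }, "id": 1 }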

There's other ways to deal with this, such as the job starting idea you
mention below.

The key point though is that I don't think creating multiple monitor
servers is a desirable approach - it is just a hack to avoid dealing
with the root causes of the problem.

> Let's take a step back from the implementation, and talk about
> *behavior* instead.
> 
> You prefer to have "the problem described handled transparently inside
> QEMU".  I read that as "QEMU must ensure the QMP monitor is available at
> all times".  "Available" means it accepts commands in bounded time.
> Some commands will always finish in bounded time once accepted, others
> may not, and whether they do may depend on the commands currently in
> flight.
> 
> Commands that can always start and always terminate in bounded time are
> no problem.
> 
> All the other commands have to become "job-starting": the QMP command
> kicks off a "job", which runs concurrently with the QMP monitor for some
> (possibly unbounded) time, then finishes.  Jobs can be examined (say to
> monitor progress, if the job supports that) and controlled (say to
> cancel, if the job supports that).
> 
> A few commands are already job-starting: migrate, the block job family,
> dump-guest-memory with detach=true.  Whether they're already hang-free I
> can't say; they could do risky work in their synchronous part.
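
As a concrete example of the job-starting shape, the detached form of
dump-guest-memory returns as soon as the job has been kicked off:

    { "execute": "dump-guest-memory",
      "arguments": { "paging": false,
                     "protocol": "file:/tmp/vmcore",
                     "detach": true } }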
> 
> Many commands that can hang are not job-starting.
> 
> Changing a command from "do the job" to "merely start the job" is a
> compatibility break.
> 
> We could make the change opt-in to preserve compatibility.  But is
> preserving a compatible QMP monitor that is prone to hang worthwhile?
> 
> If no, we may choose to use the resulting compatibility break to also
> switch the packaging of jobs from the current "synchronous command +
> broadcast message when done" to some variation of asynchronous command.
> But that should be discussed in a separate thread, and only after we
> know how we plan to ensure monitor availability.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


