
Re: [Qemu-devel] [PATCH v7 4/7] qapi: remove COMMAND_DROPPED event


From: Peter Xu
Subject: Re: [Qemu-devel] [PATCH v7 4/7] qapi: remove COMMAND_DROPPED event
Date: Wed, 5 Sep 2018 11:53:49 +0800
User-agent: Mutt/1.10.1 (2018-07-13)

On Tue, Sep 04, 2018 at 10:04:00AM +0200, Markus Armbruster wrote:
> Peter Xu <address@hidden> writes:
> 
> > On Mon, Sep 03, 2018 at 03:41:16PM +0100, Daniel P. Berrangé wrote:
> >> On Mon, Sep 03, 2018 at 09:30:52AM -0500, Eric Blake wrote:
> >> > On 09/03/2018 08:31 AM, Markus Armbruster wrote:
> >> > 
> >> > > Example:
> >> > > 
> >> > >      client sends in-band command #1
> >> > >      QEMU reads and queues
> >> > >      QEMU dequeues in-band command #1
> >> > >      in-band command #1 starts executing, but it's slooow
> >> > >      client sends in-band command #2
> >> > >      QEMU reads and queues
> >> > >      ...
> >> > >      client sends in-band command #8
> >> > >      QEMU reads, queues and suspends the monitor
> >> > >      client sends out-of-band command
> >> > > --> time passes...
> >> > >      in-band command #1 completes, QEMU sends reply
> >> > >      QEMU dequeues in-band command #2, resumes the monitor
> >> > >      in-band command #2 starts executing
> >> > >      QEMU reads and executes out-of-band command
> >> > >      out-of-band command completes, QEMU sends reply
> >> > >      in-band command #2 completes, QEMU sends reply
> >> > >      ... same for remaining in-band commands ...
> >> > > 
> >> > > The out-of-band command gets stuck behind the in-band commands.
> >
> > (I'm ashamed to have only just noticed that the out-of-band command
> >  will get stuck once we drop the COMMAND_DROPPED event... so now I
> >  agree that dropping the event is no longer ideal, but maybe still
> >  preferable)
> 
> We can queue without limit, we can drop commands, or we can suspend
> reading.  Each of these has drawbacks:
> 
> * Queuing without limit is simple for the client, but unsafe for QEMU.
> 
> * Dropping commands requires the client to cope with dropped commands.
>   As currently designed, it's just as unsafe for QEMU: instead of
>   queuing commands without limit, we get to queue their COMMAND_DROPPED
>   events without limit.  A better design could avoid this flaw.
> 
> * Suspending reading requires the client to manage the flow of commands
>   if it wants to keep the monitor available for out-of-band commands.
> 
> We decided that clients having to manage the flow of commands is no
> worse than clients having to cope with dropped commands.  There's still
> time to challenge this decision.
> 
> This series acts upon the decision: it switches from dropping commands
> to suspending reading.  Makes the input direction safe.  The output
> direction remains as unsafe as it's always been.  Fixing that is left
> for later.

Yes.  Options (1) and (2) do not seem acceptable to me, but that
conclusion rests on my view that QEMU should still protect itself
from the client.  Take QEMU & Libvirt as an example: I think a crash
or bug in either program should not affect the other.  But maybe I
misunderstood something, since I saw that you emphasized this at
[1] below...

As for (3), I really think a proper client should never trigger the
queue-full state.  As long as it doesn't, the client will never lose
the out-of-band feature to a stuck input channel.
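
To make the client side of (3) concrete, here is a minimal sketch of
such flow management.  It is not libvirt or QEMU code: the class and
method names are made up for illustration, and the limit of 7 is taken
from the example quoted above (the monitor suspends once the 8th
in-band command is queued), not from any negotiated value.

```python
from collections import deque


class FlowControlledClient:
    """Hypothetical client-side limiter for in-band QMP commands."""

    def __init__(self, max_in_flight=7):
        # Assumption: QEMU suspends reading when the 8th in-band command
        # is queued, so the client keeps at most 7 unanswered.
        self.max_in_flight = max_in_flight
        self.in_flight = 0
        self.backlog = deque()   # in-band commands held back locally
        self.sent = []           # stands in for the monitor socket

    def submit(self, cmd):
        """Queue an in-band command; send it only if the window allows."""
        self.backlog.append(cmd)
        self._pump()

    def on_reply(self):
        """Call once per in-band reply received from QEMU."""
        if self.in_flight == 0:
            raise ValueError("reply received with nothing in flight")
        self.in_flight -= 1
        self._pump()

    def send_oob(self, cmd):
        """Out-of-band commands bypass the window entirely."""
        self.sent.append(("oob", cmd))

    def _pump(self):
        # Move commands from the local backlog to the wire while the
        # number of unanswered in-band commands stays below the limit.
        while self.backlog and self.in_flight < self.max_in_flight:
            self.sent.append(("in-band", self.backlog.popleft()))
            self.in_flight += 1
```

With this, a burst of ten submit() calls puts seven commands on the
wire and keeps three in the client's own backlog, so the input channel
stays readable and send_oob() still gets through.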

> 
> >> > > 
> >> > > The client can avoid this by managing the flow of in-band commands:
> >> > > have no more than 7 in flight, so the monitor never gets suspended.
> >> > > 
> >> > > This is a potentially useful thing to do for clients, isn't it?
> >> > > 
> >> > > Eric, Daniel, is it something libvirt would do?
> >> > 
> >> > Right now, libvirt serializes commands - it never sends a QMP command
> >> > until the previous command's response has been processed. But that may
> >> > not help much, since libvirt does not send OOB commands.
> >> 
> >> Note that is not merely due to the QMP monitor restriction either.
> >> 
> >> Libvirt serializes all its public APIs that can change state of a running
> >> domain.  It usually aims to allow read-only APIs to be run in parallel with
> >> APIs that change state.
> >> 
> >> The exception to the rule right now are some of the migration APIs which
> >> we allow to be invoked to manage the migration process. 
> >> 
> >> > I guess when we are designing what libvirt should do, and deciding WHEN
> >> > it should send OOB commands, we have the luxury of designing libvirt to
> >> > enforce how many in-flight in-band commands it will ever have pending at
> >> > once (whether the current 'at most 1', or even if we make it more
> >> > parallel to 'at most 7'), so that we can still be ensured that the OOB
> >> > command will be processed without being stuck in the queue of suspended
> >> > in-band commands. If we never send more than one in-band at a time, then
> >> > it's not a concern how deep the qemu queue is; but if we do want libvirt
> >> > to start parallel in-band commands, then you are right that having a way
> >> > to learn the qemu queue depth is programmatically more precise than just
> >> > guessing the maximum depth.  But it's also hard to argue we need that
> >> > complexity if we don't have an immediate use envisioned for it.
> >> 
> >> In terms of what libvirt would want to parallelize, I think it is
> >> reasonable to consider any of the query-XXXX commands desirable. Other
> >> stuff is likely to remain serialized from libvirt's side.
> >
> > IMHO concurrency won't help much now even for query commands, since
> > our current concurrency is still "partly" - the executions of query
> > commands (which is the most time consuming part) will still be done
> > sequentially, so even if we send multiple query commands in parallel
> > (without waiting for a response of any sent commands), the total time
> > used for the list of commands would be mostly the same.
> 
> Yes.  We execute all in-band commands (regardless of their monitor) in
> the main thread.  Out-of-band commands can execute in @mon_iothread,
> which provides a modest degree of concurrency.
> 
> > My understanding of why we have such a queue length now is that it
> > came from a security concern: once we have a queue, we need a limit
> > on its length to bound the memory usage of the QMP server.  Though
> > that might not help much for real users like Libvirt, it mainly
> > serves as a way to protect QEMU QMP from being attacked or from
> > being taken down by a buggy QMP client.
> 
> Yes.
> 
> QEMU has to trust its QMP clients, so malice is not a concern, but
> accidents are.  Robust software does not buffer without bounds.

[1]

> 
> > But I agree now that the queue length information might still be
> > helpful some day.  Maybe, we can hide that until we support executing
> > commands in parallel for some of them.
> 
> Queue length can become interesting long before we get general
> concurrency.
> 
> If you use QMP only synchronously (send command #1; receive reply #1;
> send command #2; ...), then out-of-band does exactly nothing for you.
> To make use of it, you have to send an out-of-band command *before* you
> receive the previous command's reply.  That's a form of pipelining.

Yes, out-of-band should be special here.  But as Dave has already
mentioned (possibly someone else too), we may just need a length=1
queue for in-band commands and a length=1 queue for out-of-band
commands, and that should be enough at least for now (say, an oob
command will never block, and oob commands will be executed one at a
time).  With that extra length=1 out-of-band queue we gain the ability
to talk to QMP whenever necessary (though with a limited list of
cmds).
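
For concreteness, the smallest pipelined exchange such a setup would
have to handle might look like the sketch below: one in-band command is
sent, then one out-of-band command follows before the first reply has
arrived.  The helper name and the "id" values are made up for
illustration, and I am assuming migrate-pause is marked as allowing
out-of-band execution; treat the exact commands as placeholders.

```python
import json


def pipelined_session():
    """Return the client-side messages, one JSON document per line."""
    messages = [
        # Capability negotiation: ask the monitor to enable "oob".
        {"execute": "qmp_capabilities", "arguments": {"enable": ["oob"]}},
        # One in-band command occupies the length=1 in-band queue ...
        {"execute": "query-status", "id": "ib-1"},
        # ... and one out-of-band command is sent without waiting for
        # the reply to "ib-1".  Out-of-band requests use "exec-oob".
        {"exec-oob": "migrate-pause", "id": "oob-1"},
    ]
    return [json.dumps(m) + "\n" for m in messages]


if __name__ == "__main__":
    for line in pipelined_session():
        print(line, end="")
```

The "id" members matter here: replies may arrive in a different order
than the requests were sent, so the client has to match them by id.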

> 
> Note there's still no general concurrency.  There's a bit of pipelining,
> and there's a bit of concurrency between one in-band command (executing
> in main thread) and out-of-band command (executing in @mon_iothread).
> 
> Since we need to support a bit of pipelining anyway, why not support it
> more generally?  All it takes is raising the queue length limit above
> the minimum required for the use of OOB I just sketched.
> 
> Note that "since we need to support a bit of concurrency anyway, why not
> support it more generally?" would be ludicrously naive :)

Regards,

-- 
Peter Xu


