qemu-devel

Re: [Qemu-devel] [PATCH v2 00/25] qmp: add async command type


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH v2 00/25] qmp: add async command type
Date: Thu, 2 Feb 2017 10:13:50 +0000
User-agent: Mutt/1.7.1 (2016-10-04)

On Wed, Feb 01, 2017 at 08:25:10PM +0000, Marc-André Lureau wrote:
> Hi
> 
> On Wed, Feb 1, 2017 at 8:26 PM Stefan Hajnoczi <address@hidden> wrote:
> 
> > On Mon, Jan 30, 2017 at 01:18:16PM -0500, Marc-André Lureau wrote:
> > > Hi
> > >
> > > ----- Original Message -----
> > > > On Tue, Jan 24, 2017 at 01:43:17PM -0500, Marc-André Lureau wrote:
> > > > > Hi
> > > > >
> > > > > ----- Original Message -----
> > > > > > On Mon, Jan 23, 2017 at 06:27:29AM -0500, Marc-André Lureau wrote:
> > > > > > > ----- Original Message -----
> > > > > > > > On Wed, Jan 18, 2017 at 08:03:07PM +0400, Marc-André Lureau wrote:
> > > > > > > > > Hi,
> > > > > > > >
> > > > > > > > CCing Jeff Cody and John Snow, who have been working on generalizing
> > > > > > > > Block Job APIs to generic background jobs.  There is some overlap
> > > > > > > > between async commands and background jobs.
> > > > > > >
> > > > > > > If you say so :) Did I miss a proposal or a discussion for async qmp
> > > > > > > commands?
> > > > > >
> > > > > > There is no recent mailing list thread, so it's probably best to discuss
> > > > > > here:
> > > > > >
> > > > > > The goal of jobs is to support long-running operations that can be
> > > > > > managed via QMP.  Jobs can have a more elaborate lifecycle than just
> > > > > > start -> finish/cancel (e.g. they can be paused/resumed and may have
> > > > > > multiple phases of execution that the client controls).  There are QMP
> > > > > > APIs to query their state (Are they running?  How much "progress" has
> > > > > > been made?).
> > > > >
> > > > > Indeed, I mention that in my cover. Such use cases require something more
> > > > > complete than simple async qmp commands. I don't see why it would be
> > > > > incompatible with the usage of async qmp commands.
> > > > >
> > > > > > A client reconnecting to QEMU can query running jobs.  This way a client
> > > > > > can resume with a running QEMU process.  For commands like saving a
> > > > > > screenshot it mostly does not matter, but for commands that modify state
> > > > > > it's critical that clients are aware of running commands after reconnect
> > > > > > to prevent corruption/interference.  This behavior is what I asked about
> > > > > > in my previous mail.
> > > > >
> > > > > That's what I mention in the cover: some commands are global (and
> > > > > broadcast events are appropriate) and some are local to the client
> > > > > context. Some could be discarded when the client disconnects etc. It's
> > > > > case by case.
> > > > >
> > > > > > Jobs are currently only used by the block layer and called "block jobs",
> > > > > > but the idea is to generalize this.  They use synchronous QMP + events.
> > > > >
> > > > > That pattern will have the flaws I mentioned (empty return, broadcast
> > > > > events, id conflict, qapi semantic & documentation etc). Something new can
> > > > > be invented, but it will likely make the protocol more complicated
> > > > > compared to the solution I proposed (which is optional btw, and gracefully
> > > > > falls back to sync processing for clients that do not support the async qmp
> > > > > capability). However, I believe the job interface could be built on top of
> > > > > what I propose.
> > > > >
> > > > > > Jobs are more heavy-weight than async QMP commands, but pause/resume,
> > > > > > rate-limiting, progress reporting, robust reconnect, etc are important
> > > > > > features.  Users want to be aware of long-running operations and have
> > > > > > the ability to control them.
> > > > >
> > > > > You can't generalize such a job interface to all async commands. Some may not
> > > > > implement the ability to report progress, to cancel, to pause, etc. In
> > > > > the end, it will be complicated and unneeded in many cases (what's the use
> > > > > case to pause or to get the progress of a screendump?). What I propose is
> > > > > simpler and compatible with job/task interfaces appropriate for various
> > > > > domains.
> > > > >
> > > > > > I suspect that if we transition synchronous QMP commands to async we'll
> > > > > > soon have requirements for progress reporting, pause/resume, etc.  So is
> > > > > > there a set of commands that should be async and others that should be
> > > > > > jobs or should everything just be a job?
> > > > >
> > > > > Hard to say without a concrete proposal of what "job" is. Likely,
> > > > > everything is not going to be a "job".
> > > > >
> > > > > But hopefully qmp-async and jobs can co-exist and benefit from each other.
> > > >
> > > > My concern with this series is that background operations must be
> > > > observable and there must be a way to cancel them.  Otherwise management
> > > > tools cannot do their job and it's hard to troubleshoot a misbehaving
> > > > system because you can't answer the question "what's going on?".  Once
> > > > you add that then a large chunk of block jobs is duplicated.
> > >
> > > Tracking ongoing operations can also be done at the management layer. If
> > > needed, we could add qmp-commands to list on-going commands (their ids
> > > etc), and add commands to cancel them. But then again, not all operations
> > > will be cancellable, and I am not sure having requirements to list or
> > > cancel or modify all on-going operations is needed (I would say no, just
> > > like today you can't do anything while a command is running).
> >
> > It cannot be done robustly by the client.  If the client crashes then
> > there's no way of knowing what pending commands are running.  Requiring
> > the client to keep a journal would force every client that wants to be
> > robust and easy to troubleshoot to duplicate this and IMO isn't a
> > solution.
> >
> >
> My proposal allows for commands to be cancelled when the client is gone.
> And we can quite easily provide a qmp command to list on-going commands, I
> can add a patch for that.
> 
> There is no per-client context as of today, so recovering an on-going job
> would conflict with other clients (there is no per-client id or job-id
> namespace either). I don't know if there is a way to enforce only a single
> qmp client today, I would have to check.
> 
> > QEMU knows which commands are in-flight, it should be able to report
> > this info.  It's important for troubleshooting.
> >
> > I agree that it's not important today since only one command runs at a
> > time (except block jobs and migration, which do have commands to query
> > their status).  But the nature of async commands means that they can run
> > in the background for a long time, so it will be necessary.
> >
> >
> If needed, it can be added with this proposal. I will add a
> proof-of-concept patch in the next iteration.

Great.

Stefan
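
To make the points discussed above concrete (listing in-flight commands, cancelling them when a client goes away, and avoiding id conflicts between clients), here is a minimal sketch in Python. It is not QEMU code and not part of the patch series: the class and method names (InFlightRegistry, start, finish, list_commands, client_gone) are hypothetical, and keying entries by (client, id) is only one assumed way to give each client its own id namespace.

```python
from dataclasses import dataclass, field


@dataclass
class InFlightRegistry:
    """Hypothetical server-side table of running async commands."""

    # Maps (client, command id) -> command name. Keying on the pair gives
    # each client its own id namespace (an assumption of this sketch).
    _running: dict = field(default_factory=dict)

    def start(self, client, cmd_id, name):
        """Record a command as in flight; reject a duplicate id per client."""
        key = (client, cmd_id)
        if key in self._running:
            raise ValueError(f"id {cmd_id!r} already in flight for {client!r}")
        self._running[key] = name

    def finish(self, client, cmd_id):
        """Drop a command once its reply has been sent."""
        self._running.pop((client, cmd_id), None)

    def list_commands(self):
        """What a 'list on-going commands' query could return."""
        return [
            {"client": client, "id": cmd_id, "command": name}
            for (client, cmd_id), name in self._running.items()
        ]

    def client_gone(self, client):
        """Cancel everything owned by a disconnected client; return the ids."""
        cancelled = [key for key in self._running if key[0] == client]
        for key in cancelled:
            del self._running[key]
        return [cmd_id for _, cmd_id in cancelled]
```

Because the table lives on the server side, a client that crashes and reconnects could call list_commands() rather than rely on its own journal. Commands that are global rather than client-local would need a different ownership rule than the one assumed here.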
