Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command

From:	Michael Roth
Subject:	Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command
Date:	Thu, 05 Jan 2012 15:41:33 -0600
User-agent:	Mozilla/5.0 (X11; Linux i686 on x86_64; rv:8.0) Gecko/20111105 Thunderbird/8.0

On 01/05/2012 02:25 PM, Luiz Capitulino wrote:

On Thu, 05 Jan 2012 09:10:50 -0600
Michael Roth<address@hidden>  wrote:

On 01/05/2012 08:42 AM, Luiz Capitulino wrote:

On Thu, 5 Jan 2012 12:59:27 +0000
"Daniel P. Berrange"<address@hidden>   wrote:

On Thu, Jan 05, 2012 at 10:37:14AM -0200, Luiz Capitulino wrote:

On Thu, 5 Jan 2012 10:16:30 +0000
"Daniel P. Berrange"<address@hidden>   wrote:

On Wed, Jan 04, 2012 at 05:45:11PM -0200, Luiz Capitulino wrote:

This version drops modes 'sleep' and 'hybrid' because they don't work
properly due to issues in qemu. Only the 'hibernate' mode is supported
for now.


IMHO this is short-sighted. When the bugs QEMU in are fixed so that
these modes work, you have needlessly put users in the situation where
they have to now upgrade the guest agent everywhere to take advantage
of the bugfix.


That was my thinking until v4. But after discussing with Michael the issues
we have with S3 I concluded that it doesn't make sense to offer an API to
something that doesn't work, this will just generate bug reports. Also,
updating to get new features is normal and expected.


This is assuming that users will always upgrade their VMs&   hosts in
lock step, which I rather doubt they will in practice. eg imagine a
deployment might have a mixture of hosts, running QEMU 1.1 (broken S3)
and QEMU 1.2 (working S3). If they build VM disk images they will likely
use the QEMU GA from 1.2 for all their builds, even if many of them
will only run on QEMU 1.1 hosts. So you'll end up having 'sleep' and
'hybrid' commands available in the guest agent, even though the host
QEMU doesn't work properly.

So you *will* ultimately need to make sure that QEMU GA from 1.2, has
sensible behaviour when run on a QEMU 1.1 host.  If you don't address
this during 1.1, you may well find yourself in an un-winnable situation
for 1.2, where it is impossible to provide good behaviour on old hosts.

So IMHO we are better off in the long run, if we include all commands
right now, even though some don't work yet, and work to ensure we have
good error reporting behaviour for those that don't work.


Yes, I agree. As a side note: if we add error reporting it will only work
on 1.1 and later.  Ie, the problem you describe above will still happen
with 1.0.

But what you're suggesting seems to be the right thing to do. Do you agree
Michael?


Agree, but unless we add an RPC that QEMU uses to advertise
capabilities, I'm really not sure it's possible to detect whether or not
the host will support it.


You mean an RPC to advertise if 'sleep' is supported? I think this is best done
by making guest-suspend return an error as suggested by Daniel, otherwise a
client that doesn't query for capabilities might run in trouble.

Agreed, but what I mean is that if the user executes the suspend usingon up-level agent running on a down-level 1.0 host, the agent will stillsee s3 advertised and issue the buggy suspend. That's why I suggestedthe host->agent capabilities reporting as a possible (but somewhat ugly)way to just simply tell the agent it can handle it (and, lacking that,assume that it can't).


There's an important detail though: we need to make qemu not advertise S3 for
this to work. However, we might be able to fix S3 for 1.1 (and bugs, like the
S4 ones, can't be detected, limiting the scope of the 'unsupported' error).

So, we could merge all modes and commit to get S3 fixed for 1.1 :)

No disagreement there, if we can commit to making qemu-ga/qemu 1.1releases interoperable in this manner (whether by fixing s3 or notadvertising it), I think that approach is perfectly fine, ideal even.Doing a 1.1 release where qemu and qemu-ga are not interoperable (qemumissing s3 support, qemu-ga using s3) was my main objection.

But there is a 2nd topic here I'm trying to mull over: what is qemu-ga'ssupport policy for down-level hosts? backward-compatible? incompatible?

The above approach to this problem suggests the latter (qemu-ga 1.1 hasRPCs that will knowingly break 1.0 qemu instances). We could solve thisby introducing the capabilities negotiation I mentioned early. Itactually wouldn't need to be anything other than qemu telling qemu-gawhat qemu-ga version-level it supports. By default we assume 1.0, andlimit qemu-ga to that until qemu-ga is told otherwise (so, nosleep/hybrid suspend modes). For new RPCs we may be able to handle thisversion automatically, since we include qemu version levels for the RPCsin the schema. For functionality within an RPC (like sleep/hybridsuspend modes) we could use conditional code.

If we take that approach (maintaining backward-compatibility), we'd needto introduce that code in the agent now, and require qemu/libvirtexecute the guest-set-support-level RPC or whatever to access these 1.1features.

Technically, there's a required RPC qemu-ga clients need to executealready: guest-sync. It's required because we have no way to reliablydetect EOF over virtio-serial, and thus an agent may send stale data toa newly-connected qemu-ga client, so the client needs to do theguest-sync command to find the expected response and re-sync thestreams. We could roll the guest-set-support-level functionality intothat. Basically just add another field.

And if we can't detect that reliably, we're
better off leaving it out for now, because sleeping guests is not
obscure functionality, and accidentally nuking guests when a user sleeps
them (presumably because they want to retain their working state) is
much worse than telling a user to upgrade their agent, or not supported
or whatever.

As an example, if S3 is broken in current QEMU, then we should not be
advertizing S3 to the guest OS. This would allow 'pm-is-supported --suspend'
to return false, at which point the guest agent can send back a nice error
message 'Suspend is not supported on this host', instead of just having the
guest try to suspend&   hang or worse.

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [Qemu-devel] [PATCH 2/2] qemu-ga: Add the guest-suspend command, (continued)
- Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command, Daniel P. Berrange, 2012/01/05
  - Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command, Luiz Capitulino, 2012/01/05
    - Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command, Daniel P. Berrange, 2012/01/05
    - Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command, Luiz Capitulino, 2012/01/05
    - Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command, Michael Roth, 2012/01/05
    - Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command, Luiz Capitulino, 2012/01/05
    - Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command, Michael Roth <=
    - Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command, Luiz Capitulino, 2012/01/06
    - Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command, Michael Roth, 2012/01/06
    - Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command, Michael Roth, 2012/01/05
    - Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command, Daniel P. Berrange, 2012/01/05
    - Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command, Michael Roth, 2012/01/05

Prev by Date: [Qemu-devel] Fwd: oVirt FOSDEM sessions
Next by Date: [Qemu-devel] [PATCH] GuestAgent: PIDFILE remains when daemon start fails
Previous by thread: Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command
Next by thread: Re: [Qemu-devel] [PATCH v4 0/2]: qemu-ga: Add the guest-suspend command
Index(es):
- Date
- Thread