From: Anthony Liguori
Subject: Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests
Date: Mon, 30 Jan 2012 07:54:56 -0600
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.23) Gecko/20110922 Lightning/1.0b2 Thunderbird/3.1.15
On 01/30/2012 06:57 AM, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 16:57:01 -0600 Anthony Liguori <address@hidden> wrote:
On 01/26/2012 01:35 PM, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 08:18:03 -0700 Eric Blake <address@hidden> wrote:
[adding qemu-devel]
On 01/26/2012 07:46 AM, Daniel P. Berrange wrote:
One thing, that you'll probably notice is this 'set-support-level' command. Basically, it tells GA what qemu version is it running on. Ideally, this should be done as soon as GA starts up. However, that cannot be determined from outside world as GA doesn't emit any events yet. Ideally^2 this command should be left out as it should be qemu who tells its own agent this kind of information. Anyway, I was going to call this command in qemuProcess{Startup, Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs so guest can boot and start GA, but that implies returning from qemuProcess*. So I am setting this just before 'guest-suspend' command, as there is one more thing about GA. It is unable to remember anything upon its restart (GA process). Which has BTW show flaw in our current code with FS freeze & thaw. If we freeze guest FS, and somebody restart GA, the simple FS Thaw will not succeed as GA thinks FS are not frozen. But that's a different cup of tea. Because of what written above, we need to call set-level on every suspend.
IMHO all this says that the 'set-level' command is a conceptually unfixably broken design & should be killed in QEMU before it turns into an even bigger mess.
Can you elaborate on this? Michal and I talked on irc about making the compatibility level persistent, would that help?
Once we're in a situation where we need to call 'set-level' prior to every single invocation, you might as well just allow the QEMU version number to be passed in directly as an arg to the command you are running directly thus avoiding this horrificness.
Qemu folks, would you care to chime in on this? Exactly how is the set-level command supposed to work? As I understand it, the goal is that if the guest has qemu-ga 1.1 installed, but is being run by qemu 1.0, then we want to ensure that any guest agent command supported by qemu-ga 1.1 but requiring features of qemu not present in qemu 1.0 will be properly rejected.
Not exactly, the default support of qemu-ga is qemu 1.0. This means that by default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This way the set-support-level command allows you to specify that qemu 2.0 features are supported.
Version numbers are meaningless. What happens when a bunch of features get backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version of 2.0? The feature negotiation mechanism we have in QMP is the existence of a command. If we're in a position where we're trying to disable part of a command, it simply means that we should have multiple commands such that we can just remove the disabled part entirely.
You may have a point that we shouldn't be using the version number for that, but just switching to multiple commands doesn't solve the fundamental problem. The fundamental problem is that, S3 in current (and old) qemu has two known bugs:
1. The screen is left black after S3 (it's a bug in seabios)
2. QEMU resumes the guest immediately (Gerd posted patches to address this)
We're going to address both issues in 1.1. However, if qemu-ga is installed in an old qemu and S3 is used, the bugs will be triggered.
It's a management tool problem. Before a management tool issues a command, it should query the existence of the command to determine whether this version of QEMU has that capability. If the tool needs to use two commands, it should query the existence of both of them.
In this case, the management tool needs a qemu-ga command *and* a QEMU command (to resume from suspend) so it should query both of them.
Obviously, we wouldn't have a resume-from-suspend command in QEMU unless S3 worked in QEMU as expected.
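The "query both commands" check described above might look roughly like this on the management-tool side. It is only a sketch, assuming the tool already holds the parsed JSON replies of QMP `query-commands` and qemu-ga `guest-info`. The helper names are made up for illustration, and `resume-from-suspend` is just the placeholder name used in this mail, not a confirmed QMP command.

```python
def qmp_command_names(query_commands_reply):
    """Names listed in a QMP `query-commands` reply:
    {"return": [{"name": "..."}, ...]}"""
    return {entry["name"] for entry in query_commands_reply.get("return", [])}


def agent_command_names(guest_info_reply):
    """Names of enabled commands in a qemu-ga `guest-info` reply:
    {"return": {"supported_commands": [{"name": "...", "enabled": true}, ...]}}"""
    commands = guest_info_reply.get("return", {}).get("supported_commands", [])
    return {c["name"] for c in commands if c.get("enabled", True)}


def can_suspend(query_commands_reply, guest_info_reply,
                agent_cmd="guest-suspend",        # name taken from this thread
                qemu_cmd="resume-from-suspend"):  # hypothetical placeholder
    """True only if BOTH the agent command and the QEMU command exist."""
    return (agent_cmd in agent_command_names(guest_info_reply)
            and qemu_cmd in qmp_command_names(query_commands_reply))
```

The point of the design is that no version number is compared anywhere: a backported command shows up in the list, and a missing one does not.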
Alternatively, if there really was no reason to have a resume-from-suspend command, this would be the point where we would add a capabilities command adding the "working-s3" capability.
But with capabilities, this is a direct QEMU->management tool interaction, not a proxy through the guest agent.
We shouldn't trust the guest agent and we certainly don't want to rely on the guest agent to avoid sending an improper command to QEMU! That would be a security issue.
We need a way for qemu-ga to query qemu about the existence of a working S3 support. The set-support-level solves that.
qemu-ga is not an entry point for QEMU features. It's strictly a mechanism to ask the guest to do something. If we need to interact with QEMU directly to query a capability and/or presence of a command, then we should talk to QEMU directly.
To put it another way, a management tool MUST deal with the fact that when issuing the suspend-to-ram command, a guest may ignore it or attempt to do something malicious.
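In practice that could mean treating the agent's answer as a hint and confirming the outcome with QEMU itself, under a timeout. A rough sketch follows, assuming two caller-supplied callables (both hypothetical): `send_agent_suspend` issues the guest agent command, and `query_run_state` returns QEMU's run state as a string, for example the `status` field of a QMP `query-status` reply.

```python
import time


def wait_for_suspend(send_agent_suspend, query_run_state,
                     timeout=30.0, interval=0.5,
                     clock=time.monotonic, sleep=time.sleep):
    """Ask the guest to suspend, then verify the result by asking QEMU.

    Returns True if QEMU reports "suspended" before the timeout, and
    False if the guest ignored the request (or did something else).
    """
    try:
        send_agent_suspend()
    except Exception:
        # An agent error is not authoritative either way; only QEMU's
        # own view of the machine state is trusted below.
        pass

    deadline = clock() + timeout
    while True:
        if query_run_state() == "suspended":
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A caller that gets False back has to decide what to do next (retry, report an error, pause the guest from the host side), since nothing the guest said can be relied on.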
Regards, Anthony Liguori
Another option would be to disable (or enable) S3 by default in qemu-ga, and let the admin enable (or disable it) according to S3 support being fixed in qemu.