Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy

From:	Anthony Liguori
Subject:	Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
Date:	Thu, 24 Feb 2011 11:58:30 -0600
User-agent:	Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.15) Gecko/20101027 Lightning/1.0b1 Thunderbird/3.0.10

On 02/24/2011 09:22 AM, Avi Kivity wrote:

On 02/24/2011 05:00 PM, Anthony Liguori wrote:
On 02/24/2011 02:54 AM, Avi Kivity wrote:
On 02/23/2011 10:18 PM, Anthony Liguori wrote:
Then the management stack has to worry about yet another way of interacting via qemu.
{ 'StateItem': { 'key': 'str', 'value': 'str' } }
{ 'StateSection': { 'kind': 'str', 'name': 'str', 'items': ['StateItem' ] } }
{ 'StateInfo': { 'sections': [ 'StateSection' ] } }

{ 'query-state', {}, {}, 'StateInfo' }
A management tool never needs to worry about anything other than this command if it so chooses. If we have the pre-machine init mode for 0.16, then this can even be used to inspect state without running a guest.
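As a rough illustration of how a management tool might consume such a reply, here is a minimal Python sketch that mirrors the three types above. The `find_value` helper and the sample data are hypothetical, made up for illustration; nothing here is an existing QEMU or QMP interface.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StateItem:
    key: str
    value: str


@dataclass
class StateSection:
    kind: str
    name: str
    items: List[StateItem] = field(default_factory=list)


@dataclass
class StateInfo:
    sections: List[StateSection] = field(default_factory=list)


def find_value(info: StateInfo, kind: str, name: str, key: str) -> Optional[str]:
    """Return the value stored under (kind, name, key), or None if absent."""
    for section in info.sections:
        if section.kind == kind and section.name == name:
            for item in section.items:
                if item.key == key:
                    return item.value
    return None


# Hypothetical query-state reply after the tray of ide1-cd0 was emptied.
example = StateInfo(sections=[
    StateSection(kind="device", name="ide1-cd0",
                 items=[StateItem("driver", "ide-disk"),
                        StateItem("drive", "")]),
])
```

The tool would look up a single (kind, name, key) triple rather than tracking individual monitor commands.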
So we have yet another information tree. If we store the cd-rom eject state here, then we need to make an association between the device path of the cd-rom, and the StateItem key.
And this linkage is key.

Let's say I launch QEMU with:

qemu -cdrom ~/foo.img

And then in the monitor, I do:

(qemu) eject ide1-cd0
The question is, what command can I now use to launch the same qemu instance?
When I think of stateful config, what I really think of is a way to spit out a command line that essentially becomes, "this is how you now launch QEMU".
In this case, it would be:

qemu -cdrom ~/foo.img -device ide-disk,id=ide1-cd0,drive=

Or, we could think of this in terms of:

qemu -cdrom ~/foo.img -readconfig foo.cfg

Where foo.cfg contained:

[device "ide1-cd0"]
driver="ide-disk"
drive=""
So what I'm really suggesting is that we generate foo.cfg whenever monitor commands do things that change the command line and introduce a new option to reflect this, IOW:
qemu -cdrom ~/foo.img -config foo.cfg
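To make the idea of generating foo.cfg concrete, here is a small Python sketch that renders state sections in the same `[kind "name"]` layout as the example above. The `render_config` function and its input shape are assumptions for illustration, not QEMU code, and values containing quotes are not escaped.

```python
def render_config(sections):
    """Render (kind, name, {key: value}) tuples as config-file text."""
    lines = []
    for kind, name, items in sections:
        lines.append('[%s "%s"]' % (kind, name))
        # Dict order is insertion order, so keys appear as they were recorded.
        for key, value in items.items():
            lines.append('%s="%s"' % (key, value))
    return "\n".join(lines) + "\n"


# State after "eject ide1-cd0": the device remains, its drive is empty.
text = render_config([
    ("device", "ide1-cd0", {"driver": "ide-disk", "drive": ""}),
])
```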
If you move the cdrom to a different IDE channel, you have to update the stateful non-config file.
Whereas if you do

   $ qemu-img create -f cd-tray -b ~/foo.img ~/foo-media-tray.img
   $ qemu -cdrom ~/foo-media-tray.img

the cd-rom tray state will be tracked in the image file.

Yeah, but how do you move it? If you do a remove/add through QMP, then the config file will reflect things just fine.

If you want to do it outside of QEMU, then you can just ignore the config file and manage all of the state yourself. But it's never going to work as well (it will be racy) and you're pushing a tremendous amount of knowledge that ultimately belongs in QEMU (what state needs to persist) to something that isn't QEMU, which means it's probably not going to be done correctly.

I know you're a big fan of the omnipotent management tool but my experience has been that we need to help the management tooling folks more by expecting less from them.

Far better to store it in the device itself. For example, we could make a layered block format driver that stores the eject state and a "backing file" containing the actual media. Eject and media change would be recorded in the block format driver's state. You could then hot-unplug a USB cd-writer and hot-plug it back into a different guest, implementing a virtual sneakernet.
I think you're far too hung up on "store it in the device itself". The recipe to create the device model is not intrinsic to the device model. It's an independent thing that's a combination of the command line arguments and any executed monitor commands.
Maybe a better way to think about the stateful config file is a mechanism to replay the monitor history.
Again the question is who is the authoritative source of the configuration. Is it the management tool or is it qemu?

QEMU. No question about it. At any point in time, we are the authoritative source of what the guest's configuration is. There's no doubt about it. A management tool can try to keep up with us, but ultimately we are the only ones that know for sure.

We have all of this information internally. Just persisting it is not a major architectural change. It's something we should have been doing (arguably) from the very beginning.

The management tool already has to keep track of (the optional parts of) the guest device tree. It cannot start reading the stateful non-config file at random points in time. So all that is left is the guest controlled portions of the device tree, which are pretty rare, and random events like live-copy migration. I think that introducing a new authoritative source of information will create a lot of problems.

QEMU has always been the authoritative source. Nothing new has been introduced. We never persisted the machine's configuration, which meant management tools had to try to aggressively keep up with us, which is intrinsically error prone. Fixing this will only improve existing management tools.

The fact that the state is visible in the filesystem is an implementation detail.
A detail that has to be catered for by the management stack - it has to provide a safe place for it, back it up, etc.
If it cares for QEMU to preserve state. Today, this all gets thrown away.
Right, but we should make it easy, not hard.

Yeah, I fail to see how this makes it hard. We conveniently are saying, hey, this is all the state that needs to be persisted. We'll persist it for you if you want, otherwise, we'll expose it in a central location.

If the tool wants to ignore it and guess based on various combinations of other commands, more power to it.

It doesn't work for eject unless you interpose an acknowledged event. Ultimately, this is a simple problem. If you want reliability, we either need symmetric RPCs so that the device model can call (and wait) to the management layer to acknowledge a change or QEMU can post an event to the management layer, and maintain the state in a reliable fashion.
I don't see why it doesn't work.  Please explain.
1) guest eject
2) qemu posts eject event
3) qemu acknowledges eject to the guest
4) management tool sees eject event and updates guest config
There's a race between 3 & 4. It can only be addressed by interposing 4 between 2 and 3 OR making qemu persist this state between 2 and 3 such that the management tool can reliably query it.
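The second option (persist between steps 2 and 3) can be sketched with a toy Python model. Everything here is hypothetical: `store` stands in for the persisted state, `events` for the event queue to the management tool, and none of the names correspond to real QEMU code.

```python
def guest_eject(store, events, device, persist_before_ack):
    """Run steps 2 and 3 for one guest eject; return the ordered actions."""
    actions = []
    events.append(("eject", device))      # step 2: post the eject event
    actions.append("post-event")
    if persist_before_ack:
        store[device] = "ejected"         # persist between steps 2 and 3
        actions.append("persist")
    actions.append("ack-guest")           # step 3: acknowledge to the guest
    return actions


# With persistence, the state survives even if the event is never processed.
store, events = {}, []
order = guest_eject(store, events, "ide1-cd0", persist_before_ack=True)
events.clear()                            # management tool missed the event
recovered = store.get("ide1-cd0")         # still queryable afterwards

# Without persistence, losing the event loses the state change entirely.
lossy_store, lossy_events = {}, []
guest_eject(lossy_store, lossy_events, "ide1-cd0", persist_before_ack=False)
lossy_events.clear()
lost = lossy_store.get("ide1-cd0")
```

In the first run the management tool can still recover the tray state by querying; in the second there is nothing left to query once the event is gone.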
If "it" is my cd-rom tray block format driver, it works. It's really the same in action as the stateful non-config, except it's part of the device/image, not a central location.

Because you've introduced a one-off. Having a bunch of one-offs (especially being a bunch of new block formats!) is not going to make things simpler for management tools.

I disagree. Storing NVRAM as a disk image is a simple extension of existing management tools. Block live-copy and cd-rom eject state also make sense as per-image state if you take hotunplug and hotplug into account.
Everything can be stored in a block driver but when the data is highly structured, isn't it nice to expose it in a structured, human readable way? I know I'd personally prefer a text representation of CMOS than a binary blob.
Have a tool expose it.  Part of the range is unspecified anyway.


I guess we need to agree to disagree then.

Using a block format driver means that we don't have to care about a crash during a write, that we can snapshot it, etc.

Why? We always need to care about a crash during write. What I've been thinking for a config file is the classic approach of using a ~ and .# file to make sure that we write out the new file and then atomically rename it to get the new contents. Yeah, it's a bit heavyweight but this shouldn't be a very common thing to update.
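The write-then-rename pattern described here can be sketched in Python as follows. This is a generic illustration, not QEMU code: the `atomic_write` name and the `.#` temporary-file prefix are choices made for the example, and it omits keeping a `~` backup and syncing the containing directory.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace path so readers see either the old or the new contents."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the final rename would not be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".#")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())    # push the data to disk before renaming
        os.replace(tmp_path, path)    # atomic rename over the old file
    except BaseException:
        # On failure, drop the partial temporary file and leave path untouched.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

A crash before the rename leaves the old file intact; a crash after it leaves the new one.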

Device settings should be stored with the devices, not with qemu.
Suppose we take the cold-plug on startup via the monitor approach. So we start with a bare machine, cold plug stuff into it. Now qemu has to reconcile the stateful non-config file with the hardware. What if something has changed? A device moved into a different slot?
Sorry, I'm confused. Is there anything in the stateful config file when we start up? If so, the act of starting up will add a bunch of hardware.
Suppose it has information about ide1-cd0's media tray. Now we restart qemu and cold-plug the cdrom into ide0-cd0. What happens to the information?

Whether media is present is not a property of a blockdev, it's a property of a device. What does it even mean to use your media-tray format with something like a CMOS device?

Technically, mac address is stored on eeprom and we store that as a device property today. We can't persist device properties even though you can change the mac address of a network card and it does persist across reboots. Are you advocating that we introduce an eeprom for every network card (all in a slightly different format) and have special tools to manipulate the eeprom to store and view the mac address?
Yes -- if we really want to support it. Obviously we have to store the entire eeprom, not just the portion containing the MAC address, so it's not just a key/value store. A card may even have its firmware in flash.

I think that's overengineering. I think we can go very far by just persisting small amounts of information in a central location. We're not building a cycle-accurate simulator here after all.


Regards,

Anthony Liguori
