Re: [Qemu-devel] [RFC] More robust migration

From:	Anthony Liguori
Subject:	Re: [Qemu-devel] [RFC] More robust migration
Date:	Fri, 20 Feb 2009 12:27:30 -0600
User-agent:	Thunderbird 2.0.0.19 (X11/20090105)

Jamie Lokier wrote:

Anthony Liguori wrote:
2. Introduce a length field to the header of each device.
IMHO, this would reduce robustness. It's also difficult because of the way savevm registration works. You don't know how large a section is until it's written, and migration streams are not seekable.
The way HTTP deals with not knowing the size in advance is to split
data into chunks, each chunk the size of a small write buffer, with a
chunk size written in front of each one.  This allows storing
sections of binary data whose size isn't known in advance while still
safely skipping them.
This would allow skipping unknown (or unwanted) devices.
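
To make the chunking idea concrete, here is a minimal sketch of length-prefixed framing in the spirit of HTTP chunked transfer coding. It is not QEMU's savevm format: the 4-byte big-endian length prefix, the zero-length terminator, the 4096-byte buffer size, and the function names are all assumptions chosen for illustration.

```python
import io
import struct

CHUNK_MAX = 4096  # assumed size of the small write buffer


def write_chunked(out, pieces):
    """Write one section as length-prefixed chunks, ended by a zero-length chunk."""
    for piece in pieces:
        for off in range(0, len(piece), CHUNK_MAX):
            chunk = piece[off:off + CHUNK_MAX]
            out.write(struct.pack(">I", len(chunk)))  # chunk size goes first
            out.write(chunk)
    out.write(struct.pack(">I", 0))  # terminator: section ends here


def read_exact(inp, n):
    """Read exactly n bytes or fail; the stream is not seekable, so no retries."""
    data = inp.read(n)
    if len(data) != n:
        raise EOFError("truncated stream")
    return data


def read_section(inp):
    """Read one chunked section and return its payload."""
    parts = []
    while True:
        (size,) = struct.unpack(">I", read_exact(inp, 4))
        if size == 0:
            return b"".join(parts)
        parts.append(read_exact(inp, size))


def skip_section(inp):
    """Skip one chunked section without interpreting it; return bytes skipped."""
    total = 0
    while True:
        (size,) = struct.unpack(">I", read_exact(inp, 4))
        if size == 0:
            return total
        read_exact(inp, size)
        total += size
```

The writer never needs to know the total section size up front, and the reader only ever moves forward, so this works on a non-seekable stream. Whether a reader should be allowed to skip a section at all is a separate policy question.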
No good can come from this. If you have an unknown section, you must throw an error and stop the migration. What if this is for a device that the guest is interacting with? The device just disappears after migration? All savevm state is state that affects the functionality of a guest. Throwing away this state will change the functionality of the VM, and migration should not affect guest functionality.
What if you're migrating from a snapshot made on a host with some
pass-through USB device to another host which cannot provide the same
device?  In that case I'd like the option for the guest to see the
device has disappeared.  Maybe it's stopped working (HPET), or maybe
it's unplugged (anything hot unpluggable).

Having a device stop working is IMHO unacceptable. Devices that support hot plugging, you can hot unplug and *then* perform the migration.

In general, hot unplugging requires guest cooperation FWIW. Bad things will often happen if you just yank a USB cable out of your computer.

That's preferable to not being able to use the snapshot at all,
effectively having to trash it.

I disagree. Something that is broken in an unknown way is not better than having something gracefully fail. If you do hardware pass through, forget about snapshotting/migration/etc.

What are the use cases where you think this would be beneficial? I really see the change in semantics from the old way (throwing away unknown sections) to the new way (requiring strict versioning and validating all sections) as being a huge step toward robustness.
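
As a rough illustration of the strict semantics described above (fail on anything unknown rather than discard it), here is a minimal sketch. It is not QEMU's loader: the `(name, version, payload)` section tuple, the `handlers` registry, and `MigrationError` are hypothetical names invented for this example.

```python
class MigrationError(Exception):
    """Raised when an incoming stream cannot be loaded safely."""


def load_stream(sections, handlers):
    """Load every section or fail; never silently drop state.

    sections: iterable of (name, version, payload) tuples (assumed layout).
    handlers: dict mapping name -> (set of accepted versions, loader function).
    """
    state = {}
    for name, version, payload in sections:
        if name not in handlers:
            # Unknown section: stop instead of skipping it.
            raise MigrationError(f"unknown section {name!r}")
        accepted_versions, loader = handlers[name]
        if version not in accepted_versions:
            # Version we do not (or no longer) accept: stop as well.
            raise MigrationError(f"unsupported version {version} for {name!r}")
        state[name] = loader(version, payload)
    return state
```

The state dictionary is only returned once every section has validated, so a caller never sees a partially loaded guest.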
I've been upset at a "savevm" which I wrote with some past version of
QEMU that I couldn't load in a later version.  It wasn't obvious why,
just that it refused. And I didn't have the old version, or even know
which the old version was.  And even if I could have reconstructed the
old QEMU - I wanted to migrate to a newer version.  It's no fun having
to reconstruct a carefully primed guest snapshot test state from its
reboot, if that can be avoided.

Device configuration files will go a long way to upgrading. Sometimes you have to blacklist older versions of devices because there were bugs in the save/restore functions. In that case, there's really nothing we can do. Your snapshot was invalid.

My primary goal for migration is robustness. I do not think it's a good idea to support any circumstances that could introduce changes in guest visible state during a live migration.
What about safe hotpluggable devices?

Make your changes in the guest to allow safe unplug, then unplug, then migrate.


Regards,

Anthony Liguori
