qemu-devel

Re: [PATCH V8 06/39] cpr: reboot mode


From: Daniel P . Berrangé
Subject: Re: [PATCH V8 06/39] cpr: reboot mode
Date: Thu, 16 Jun 2022 12:10:11 +0100
User-agent: Mutt/2.2.1 (2022-02-19)

On Wed, Jun 15, 2022 at 07:51:53AM -0700, Steve Sistare wrote:
> Provide the cpr-save and cpr-load functions for live update.  These save and
> restore VM state, with minimal guest pause time, so that qemu may be updated
> to a new version in between.
> 
> cpr-save stops the VM and saves vmstate to an ordinary file.  It supports
> any type of guest image and block device, but the caller must not modify
> guest block devices between cpr-save and cpr-load.
> 
> cpr-save supports several modes, the first of which is reboot. In this mode
> the caller invokes cpr-save and then terminates qemu.  The caller may then
> update the host kernel and system software and reboot.  The caller resumes
> the guest by running qemu with the same arguments as the original process
> and invoking cpr-load.  To use this mode, guest ram must be mapped to a
> persistent shared memory file such as /dev/dax0.0 or /dev/shm PKRAM.
> 
> The reboot mode supports vfio devices if the caller first suspends the
> guest, such as by issuing guest-suspend-ram to the qemu guest agent.  The
> guest drivers' suspend methods flush outstanding requests and re-initialize
> the devices, and thus there is no device state to save and restore.
> 
> cpr-load loads state from the file.  If the VM was running at cpr-save time
> then VM execution resumes.  If the VM was suspended at cpr-save time, then
> the caller must issue a system_wakeup command to resume.
> 
> cpr-save syntax:
>   { 'enum': 'CprMode', 'data': [ 'reboot' ] }
>   { 'command': 'cpr-save', 'data': { 'filename': 'str', 'mode': 'CprMode' }}
> 
> cpr-load syntax:
>   { 'command': 'cpr-load', 'data': { 'filename': 'str', 'mode': 'CprMode' }}
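
For reference, the reboot-mode flow described in the quoted text could be
driven over QMP roughly as sketched below. This is only an illustration:
cpr-save and cpr-load are the commands proposed by this patch, not an
existing QEMU API, and the helper names (qmp_command, cpr_reboot_sequence)
are made up for the example. The QMP capabilities handshake and reply
handling are left out.

```python
import json


def qmp_command(name, **arguments):
    """Serialize one QMP command as a single JSON line."""
    msg = {"execute": name}
    if arguments:
        msg["arguments"] = arguments
    return json.dumps(msg)


def cpr_reboot_sequence(state_file, vm_was_suspended=False):
    """Return (before_reboot, after_reboot) lists of QMP command lines.

    Assumes the cpr-save/cpr-load syntax proposed in the patch.
    """
    # Old QEMU: save vmstate to the file, then terminate the process.
    before = [
        qmp_command("cpr-save", filename=state_file, mode="reboot"),
        qmp_command("quit"),
    ]
    # New QEMU, started with the same arguments: load the saved state.
    after = [qmp_command("cpr-load", filename=state_file, mode="reboot")]
    if vm_was_suspended:
        # A guest that was suspended at cpr-save time needs an explicit wakeup.
        after.append(qmp_command("system_wakeup"))
    return before, after
```

The guest-suspend-ram step needed for vfio devices goes to the guest agent
rather than to QMP, so it is not part of this sequence.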

I'm still a little unsure if this direction for QAPI exposure is the
best, or whether we should instead leverage the migration commands.

I'm particularly concerned that we might regret having an API that
is designed only around storage in local files/blockdevs. The
migration layer has the flexibility to use many protocols, which has
been useful in the past to be able to offload work to an external
process. For example, libvirt uses migrate-to-fd so it can use
a helper that adds O_DIRECT support such that we avoid trashing
the host I/O cache for save/restore.
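
To make the fd-passing approach concrete, here is a rough sketch of how a
management application can hand QEMU one end of a pipe and then migrate
into it. It uses the existing QMP commands getfd and migrate with an
"fd:" URI; the function names and the "savepipe" label are only
illustrative, and the capabilities handshake and reply parsing are
omitted. The helper process on the other end of the pipe, which is where
the O_DIRECT writes would happen, is not shown.

```python
import array
import json
import socket


def getfd_command(fdname):
    """QMP 'getfd': names the fd that arrives as ancillary data with it."""
    return json.dumps({"execute": "getfd", "arguments": {"fdname": fdname}})


def migrate_fd_command(fdname):
    """QMP 'migrate' to a previously passed fd, using the fd: URI scheme."""
    return json.dumps({"execute": "migrate", "arguments": {"uri": "fd:" + fdname}})


def pass_fd_and_migrate(qmp_sock, fd, fdname="savepipe"):
    """Send fd over a connected UNIX-socket QMP channel, then start migration.

    Replies from QEMU are not read here; a real client must check them.
    """
    # The descriptor travels as SCM_RIGHTS ancillary data on the getfd message.
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", [fd]))]
    qmp_sock.sendmsg([getfd_command(fdname).encode() + b"\n"], ancillary)
    qmp_sock.sendall(migrate_fd_command(fdname).encode() + b"\n")
```

The point of this design is that QEMU only ever sees a pipe or socket, so
any I/O policy (O_DIRECT, compression, a remote target) can live in the
external helper.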

At the same time though, the migrate APIs don't currently support
a plain "file" protocol. This was because historically we needed
the QEMUFile to support O_NONBLOCK and this fails with plain
files or block devices, so QEMU threads could get blocked. For
the save side this doesn't matter so much, as QEMU now has the
outgoing migrate channels in blocking mode; only the incoming
side uses non-blocking.  We could add a plain "file" protocol
to migration if we clearly document its limitations, and indeed
I've suggested we do that for another unrelated bit of work
for libvirt's VM save/restore functionality.
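
The O_NONBLOCK limitation is easy to see from userspace. The sketch
below (Linux/POSIX behaviour; function names are just for illustration)
sets the flag on an empty pipe and on an empty regular file. The pipe
read reports that it would block, whereas the regular file accepts the
flag but never returns EAGAIN, so a read from slow storage would simply
stall the calling thread.

```python
import os
import tempfile


def read_nonblocking(fd, size=4096):
    """Set O_NONBLOCK on fd and attempt one read.

    Returns the bytes read, or None if the kernel reported EAGAIN.
    """
    os.set_blocking(fd, False)
    try:
        return os.read(fd, size)
    except BlockingIOError:
        return None


def compare_pipe_and_file():
    """Return (pipe_result, file_result) for an empty pipe and an empty file."""
    r, w = os.pipe()
    try:
        # Empty pipe with a live writer: the read would block, so EAGAIN.
        pipe_result = read_nonblocking(r)
    finally:
        os.close(r)
        os.close(w)
    with tempfile.TemporaryFile() as f:
        # Regular file: the flag is accepted but ignored; read returns EOF.
        file_result = read_nonblocking(f.fileno())
    return pipe_result, file_result
```

With a regular file there is no way for an event loop to learn that the
data is not ready yet, which is the limitation a "file" protocol would
need to document.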


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



