qemu-devel

Re: VFIO Migration


From: Daniel P . Berrangé
Subject: Re: VFIO Migration
Date: Tue, 3 Nov 2020 15:33:56 +0000
User-agent: Mutt/1.14.6 (2020-07-11)

On Tue, Nov 03, 2020 at 04:23:43PM +0100, Christophe de Dinechin wrote:
> 
> On 2020-11-02 at 12:11 CET, Stefan Hajnoczi wrote...
> > There is discussion about VFIO migration in the "Re: Out-of-Process
> > Device Emulation session at KVM Forum 2020" thread. The current status
> > is that Kirti proposed a VFIO device region type for saving and loading
> > device state. There is currently no guidance on migrating between
> > different device versions or device implementations from different
> > vendors. This is known to be non-trivial and raised discussion about
> > whether it should really be handled by VFIO or centralized in QEMU.
> >
> > Below is a document that describes how to ensure migration compatibility
> > in VFIO. It does not require changes to the VFIO migration interface. It
> > can be used for both VFIO/mdev kernel devices and vfio-user devices.
> >
> > The idea is that the device state blob is opaque to the VMM but the same
> > level of migration compatibility that exists today is still available.
> >
> > I hope this will help us reach consensus and let us discuss specifics.
> >
> > If you followed the previous discussion, I changed the approach from
> > sending a magic constant in the device state blob to identifying device
> > models by URIs. Therefore the device state structure does not need to be
> > defined here - the critical information for ensuring device migration
> > compatibility is the device model and configuration defined below.
> >
> > Stefan
> > ---
> > VFIO Migration
> > ==============
> > This document describes how to save and load VFIO device states. Saving a
> > device state produces a snapshot of a VFIO device's state that can be loaded
> > again at a later point in time to resume the device from the snapshot.
> >
> > The data representation of the device state is outside the scope of this
> > document.
> >
> > Overview
> > --------
> > The purpose of device states is to save the device at a point in time and
> > then restore the device back to the saved state later. This is more
> > challenging than it first appears.
> >
> > The process of saving a device state and loading it later is called
> > *migration*. The state may be loaded by the same device that saved it or by
> > a new instance of the device, possibly running on a different computer.
> >
> > It must be possible to migrate to a newer implementation of the device
> > as well as to an older implementation of the device. This allows users
> > to upgrade and roll back their systems.
> >
> > Migration can fail if loading the device state is not possible. It should
> > fail early with a clear error message. It must not appear to complete but
> > leave the device inoperable due to a migration problem.
> >
> > The rest of this document describes how these requirements can be met.
> >
> > Device Models
> > -------------
> > Devices have a *hardware interface* consisting of hardware registers,
> > interrupts, and so on.
> >
> > The hardware interface together with the device state representation is
> > called a *device model*. Device models can be assigned URIs such as
> > https://qemu.org/devices/e1000e to uniquely identify them.
> 
> Like others, I think we should either
> 
> a) Give a relatively strong requirement regarding what is at the URL in
> question, e.g. docs, maybe even a machine-readable schema describing
> configuration and state for the device. Leaving the option "there can be
> nothing here" is IMO asking for trouble.
> 
> b) simply call that a unique ID, and then either drop the https: entirely or
> use something else, like pci:// or, to be more specific, vfio://
> 
> I'd favor option (b) for a different practical reason. URLs are subject to
> redirection and other mishaps. For example, using https:// begs the question
> whether
> https://qemu.org/devices/e1000e and
> https://www.qemu.org/devices/e1000e
> should be treated as the same device. I believe that your intent is that
> they shouldn't, but if the qemu web server redirects to www, and someone
> wants to copy-paste their web browser's URL bar to the command line, they'd
> get the wrong one.

That's not a real-world problem IMHO, because neither of these URLs
ever needs to resolve to a real webpage, and thus neither ever needs
to be cut and pasted from a browser.

They are simply expressing a resource identifier using a URI as a
convenient format. This is the same as an XML namespace using a URI,
and rarely, if ever, resolving to any actual web page.
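
To illustrate the XML namespace analogy, here is a minimal Python sketch
using the standard library's ElementTree. The two namespace URIs are just
the example URLs from this thread, used as hypothetical identifiers; the
parser never fetches them and treats them as plain strings.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, borrowed from the example URLs in this thread.
# Parsing does not access the network; the URIs are only names.
doc = """<devices>
  <a:device xmlns:a="https://qemu.org/devices/e1000e"/>
  <b:device xmlns:b="https://www.qemu.org/devices/e1000e"/>
</devices>"""

root = ET.fromstring(doc)

# ElementTree reports each tag as "{namespace-uri}localname".
tags = [child.tag for child in root]

# The two elements carry different names because the URI strings differ,
# regardless of whether a web server would redirect one to the other.
print(tags[0] == tags[1])
```

The two elements end up in different namespaces purely because the strings
differ, which is the behaviour being suggested for device model URIs.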

This is a good thing, because if you say there needs to be a real page
there, then it creates a pile of corporate bureaucracy for contributors.
I can freely create a URI under https://redhat.com for the purpose of being
an identifier, but I cannot get any content published there without jumping
through many tedious corporate approvals, and I would stand a good chance
of being rejected.

If we're truly treating the URIs as opaque strings, we don't especially
need to define any rules other than to say each should be under a domain
that you have authority over, either directly or via membership of a project
that delegates. We can suggest "https" since seeing "http" is a red flag
for many people these days.
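
As a rough sketch of what "opaque string" means in practice, a
compatibility check could be nothing more than a byte-for-byte string
comparison. The function name same_device_model is made up for this
example and is not part of any QEMU or VFIO interface.

```python
def same_device_model(saved_uri: str, target_uri: str) -> bool:
    """Hypothetical helper: two device model identifiers match only if
    the strings are identical."""
    # Opaque comparison: no fetching, no redirect-following, and no
    # normalisation of scheme, host, or trailing slash.
    return saved_uri == target_uri


saved = "https://qemu.org/devices/e1000e"

# Identical strings name the same device model.
print(same_device_model(saved, "https://qemu.org/devices/e1000e"))

# A "www." prefix makes it a different identifier, even if a web server
# would happen to redirect one URL to the other.
print(same_device_model(saved, "https://www.qemu.org/devices/e1000e"))
```

With this rule there is nothing for a browser redirect to confuse, since
the string given in the configuration is the identifier.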

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



