Re: VFIO Migration
From: Alex Williamson
Subject: Re: VFIO Migration
Date: Tue, 3 Nov 2020 10:31:35 -0700

On Tue, 3 Nov 2020 15:33:56 +0000
Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Tue, Nov 03, 2020 at 04:23:43PM +0100, Christophe de Dinechin wrote:
> >
> > On 2020-11-02 at 12:11 CET, Stefan Hajnoczi wrote...
> > > There is discussion about VFIO migration in the "Re: Out-of-Process
> > > Device Emulation session at KVM Forum 2020" thread. The current status
> > > is that Kirti proposed a VFIO device region type for saving and loading
> > > device state. There is currently no guidance on migrating between
> > > different device versions or device implementations from different
> > > vendors. This is known to be non-trivial and raised discussion about
> > > whether it should really be handled by VFIO or centralized in QEMU.
> > >
> > > Below is a document that describes how to ensure migration compatibility
> > > in VFIO. It does not require changes to the VFIO migration interface. It
> > > can be used for both VFIO/mdev kernel devices and vfio-user devices.
> > >
> > > The idea is that the device state blob is opaque to the VMM but the same
> > > level of migration compatibility that exists today is still available.
> > >
> > > I hope this will help us reach consensus and let us discuss specifics.
> > >
> > > If you followed the previous discussion, I changed the approach from
> > > sending a magic constant in the device state blob to identifying device
> > > models by URIs. Therefore the device state structure does not need to be
> > > defined here - the critical information for ensuring device migration
> > > compatibility is the device model and configuration defined below.
> > >
> > > Stefan
> > > ---
> > > VFIO Migration
> > > ==============
> > > This document describes how to save and load VFIO device states. Saving a
> > > device state produces a snapshot of a VFIO device's state that can be loaded
> > > again at a later point in time to resume the device from the snapshot.
> > >
> > > The data representation of the device state is outside the scope of this
> > > document.
> > >
> > > Overview
> > > --------
> > > The purpose of device states is to save the device at a point in time and
> > > then restore the device back to the saved state later. This is more
> > > challenging than it first appears.
> > >
> > > The process of saving a device state and loading it later is called
> > > *migration*. The state may be loaded by the same device that saved it or
> > > by a new instance of the device, possibly running on a different computer.
> > >
> > > It must be possible to migrate to a newer implementation of the device
> > > as well as to an older implementation of the device. This allows users
> > > to upgrade and roll back their systems.
> > >
> > > Migration can fail if loading the device state is not possible. It should
> > > fail early with a clear error message. It must not appear to complete but
> > > leave the device inoperable due to a migration problem.
> > >
> > > The rest of this document describes how these requirements can be met.
> > >
> > > Device Models
> > > -------------
> > > Devices have a *hardware interface* consisting of hardware registers,
> > > interrupts, and so on.
> > >
> > > The hardware interface together with the device state representation is
> > > called a *device model*. Device models can be assigned URIs such as
> > > https://qemu.org/devices/e1000e to uniquely identify them.
> >
> > Like others, I think we should either
> >
> > a) Give a relatively strong requirement regarding what is at the URL in
> > question, e.g. docs, maybe even a machine-readable schema describing
> > configuration and state for the device. Leaving the option "there can be
> > nothing here" is IMO asking for trouble.
> >
> > b) simply call that a unique ID, and then either drop the https: entirely or
> > use something else, like pci:// or, to be more specific, vfio://
> >
> > I'd favor option (b) for a different practical reason. URLs are subject to
> > redirection and other mishaps. For example, using https:// begs the question
> > whether
> > https://qemu.org/devices/e1000e and
> > https://www.qemu.org/devices/e1000e
> > should be treated as the same device. I believe that your intent is that
> > they shouldn't, but if the qemu web server redirects to www, and someone
> > wants to copy-paste their web browser's URL bar to the command line, they'd
> > get the wrong one.
>
> That's not a real-world problem IMHO, because neither of these URLs
> ever needs to resolve to a real web page, and thus neither needs to be
> cut and pasted from a browser.
>
> They are simply expressing a resource identifier using a URI as a
> convenient format. This is the same as an XML namespace using a URI,
> and rarely, if ever, resolving to any actual web page.
>
> This is a good thing, because if you say there needs to be a real page
> there, then it creates a pile of corporate bureaucracy for contributors.
> I can freely create a URI under https://redhat.com for purposes of being
> an identifier, but I cannot get any content published there without jumping
> through many tedious corporate approvals and stand a good chance of being
> rejected.
>
> If we're truly treating the URIs as an opaque string, we don't especially
> need to define any rules other than to say it should be under a domain that
> you have authority over either directly, or via membership of a project
> that delegates. We can suggest "https" since seeing "http" is a red flag
> for many people these days.

Hmm, an opaque string, sort of like the existing "name" attribute we
have now, where Christophe quoted some examples in his message. Thanks,

Alex
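
For what it's worth, here is a rough Python sketch of what "opaque string"
would mean in practice for the two example URIs from Christophe's message.
Nothing below comes from VFIO or QEMU: the names MigrationError and
check_device_model are hypothetical, and the only assumption is that the
device model identifier is compared byte for byte, with no normalization and
no attempt to fetch anything from the network.

```python
class MigrationError(Exception):
    """Hypothetical error raised before any device state is transferred."""


def check_device_model(source_uri: str, destination_uri: str) -> None:
    """Fail early unless both sides name exactly the same device model.

    The identifiers are treated as opaque strings: no scheme or host
    normalization, no redirect handling, and no network access.
    """
    if source_uri != destination_uri:
        raise MigrationError(
            f"device model mismatch: source={source_uri!r} "
            f"destination={destination_uri!r}"
        )


# Identical identifiers on both sides: the check passes silently.
check_device_model("https://qemu.org/devices/e1000e",
                   "https://qemu.org/devices/e1000e")

# The "www" variant is a different identifier under this rule, even if a
# web server would happen to redirect one to the other.
try:
    check_device_model("https://qemu.org/devices/e1000e",
                       "https://www.qemu.org/devices/e1000e")
except MigrationError as err:
    print(err)
```

Under that assumption the "www" question has a simple answer: the two strings
differ, so they name different device models, and the mismatch is reported
with a clear error before any state is loaded.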