qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC v3] VFIO Migration


From: Dr. David Alan Gilbert
Subject: Re: [RFC v3] VFIO Migration
Date: Tue, 24 Nov 2020 17:24:58 +0000
User-agent: Mutt/1.14.6 (2020-07-11)

* Alex Williamson (alex.williamson@redhat.com) wrote:
> On Mon, 16 Nov 2020 14:52:26 +0100
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Mon, 16 Nov 2020 11:02:51 +0000
> > Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > 
> > > On Wed, Nov 11, 2020 at 04:35:43PM +0100, Cornelia Huck wrote:  
> > > > On Wed, 11 Nov 2020 15:14:49 +0000
> > > > Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > > >     
> > > > > On Wed, Nov 11, 2020 at 12:48:53PM +0100, Cornelia Huck wrote:    
> > > > > > On Tue, 10 Nov 2020 13:14:04 -0700
> > > > > > Alex Williamson <alex.williamson@redhat.com> wrote:      
> > > > > > > On Tue, 10 Nov 2020 09:53:49 +0000
> > > > > > > Stefan Hajnoczi <stefanha@redhat.com> wrote:      
> > > > > >       
> > > > > > > > Device models supported by an mdev driver and their details can 
> > > > > > > > be read from
> > > > > > > > the migration_info.json attr. Each mdev type supports one 
> > > > > > > > device model. If a
> > > > > > > > parent device supports multiple device models then each device 
> > > > > > > > model has an
> > > > > > > > mdev type. There may be multiple mdev types for a single device 
> > > > > > > > model when they
> > > > > > > > offer different migration parameters such as resource capacity 
> > > > > > > > or feature
> > > > > > > > availability.
> > > > > > > > 
> > > > > > > > For example, a graphics card that supports 4 GB and 8 GB device 
> > > > > > > > instances would
> > > > > > > > provide gfx-4GB and gfx-8GB mdev types with memory=4096 and 
> > > > > > > > memory=8192
> > > > > > > > migration parameters, respectively.        
> > > > > > > 
> > > > > > > 
> > > > > > > I think this example could be expanded for clarity.  I think this 
> > > > > > > is
> > > > > > > suggesting we have mdev_types of gfx-4GB and gfx-8GB, which each
> > > > > > > implement some common device model, ie. com.gfx/GPU, where the
> > > > > > > migration parameter 'memory' for each defaults to a value 
> > > > > > > matching the
> > > > > > > type name.  But it seems like this can also lead to some 
> > > > > > > combinatorial
> > > > > > > challenges for management tools if these parameters are writable. 
> > > > > > >  For
> > > > > > > example, should a management tool create a gfx-4GB device and 
> > > > > > > change to
> > > > > > > memory parameter to 8192 or a gfx-8GB device with the default 
> > > > > > > parameter?      
> > > > > > 
> > > > > > I would expect that the mdev types need to match in the first place.
> > > > > > What role would the memory= parameter play, then? Allowing gfx-4GB 
> > > > > > to
> > > > > > have memory=8192 feels wrong to me.      
> > > > > 
> > > > > Yes, I expected these mdev types to only accept a fixed "memory" 
> > > > > value,
> > > > > but there's nothing stopping a driver author from making "memory" 
> > > > > accept
> > > > > any value.    
> > > > 
> > > > I'm wondering how useful the memory parameter is, then. The layer
> > > > checking for compatibility can filter out inconsistent settings, but
> > > > why would we need to express something that is already implied in the
> > > > mdev type separately?    
> > > 
> > > To avoid tying device instances to specific mdev types. An mdev type is
> > > a device implementation, but the goal is to enable migration between
> > > device implementations (new/old or completely different
> > > implementations).
> > > 
> > > Imagine a new physical device that now offers variable memory because
> > > users found the static mdev types too constraining.  How do you migrate
> > > back and forth between new and old physical devices if the migration
> > > parameters don't describe the memory size? Migration parameters make it
> > > possible. Without them the management tool needs to hard-code knowledge
> > > of specific mdev types that support migration.  
> > 
> > But doesn't the management tool *still* need to keep hardcoded
> > information about what the value of that memory parameter was for an
> > existing mdev type? If we have gfx-variable with a memory parameter,
> > fine; but if the target is supposed to accept a gfx-4GB device, it
> > should simply instantiate a gfx-4GB device.
> > 
> > I'm getting a bit worried about the complexity of the checking that
> > management software is supposed to perform. Is it really that bad to
> > restrict the models to a few, well-defined ones? Especially in the mdev
> > case, where we have control about what is getting instantiated?
> 
> This is exactly what I was noting with the combinatorial challenges of
> the management tool.  If a vendor chooses to use a generic base device
> model which they modify with parameters to match an assortment of mdev
> types, then management tools will need to match every mdev type
> implementing that device model to determine if compatible parameters
> exist.  OTOH, the vendor could choose to create a device model that
> specifically describes a single configuration of known parameters.
> 
> For example, mdev type gfx-4GB might be a device model com.gfx/GPU with
> a fixed memory parameter of 4GB or it could be a device model
> com.gfx/GPU-4G with no additional parameter.  The hard part is when the
> vendor offers an mdev type gfx-varGB with device model com.gfx/GPU and
> available memory options of 1GB, 2GB, 4GB, 8GB.  At that point a
> management tool might decide to create a gfx-varGB device instance and
> tune the memory parameter or create a gfx-4GB instance, either would be
> correct and we've expressed no preference for one or the other.  Thanks,

What you've described here is exactly what happens with QEMU/libvirts
confusion of CPU models.  Both QEMU and Libvirt have their idea of what
a named CPU model means and then add/subtract flags to get what they
want.
When libvirt wants a CPU model that doesn't quite match what it has
(e.g. a host-compatibility thing where the host is a CPU it didn't know)
it's heuristics to either start from above and remove things or start
from below and add them.

Dave

> Alex
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




reply via email to

[Prev in Thread] Current Thread [Next in Thread]