From: Jason Gunthorpe
Subject: Re: [PATCH v2 17/20] vfio/common: Support device dirty page tracking with vIOMMU
Date: Thu, 23 Feb 2023 19:26:00 -0400

On Thu, Feb 23, 2023 at 03:33:09PM -0700, Alex Williamson wrote:
> On Thu, 23 Feb 2023 16:55:54 -0400
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Thu, Feb 23, 2023 at 01:06:33PM -0700, Alex Williamson wrote:
> > > > #2 is the presumption that the guest is using an identity map.  
> > > 
> > > This is a dangerous assumption.
> > >   
> > > > > I'd think the only viable fallback if the vIOMMU doesn't report its max
> > > > > IOVA is the full 64-bit address space, otherwise it seems like we need
> > > > > to add a migration blocker.    
> > > > 
> > > > This is basically saying vIOMMU doesn't work with migration, and we've
> > > > heard that this isn't OK. There are cases where vIOMMU is on but the
> > > > guest always uses identity maps. eg for virtual interrupt remapping.  
> > > 
> > > Yes, the vIOMMU can be automatically added to a VM when we exceed 255
> > > vCPUs, but I don't see how we can therefore deduce anything about the
> > > usage mode of the vIOMMU.    
> > 
> > We just lose optimizations. Any mappings that are established outside
> > the dirty tracking range are permanently dirty. So at worst the guest
> > can block migration by establishing bad mappings. It is not exactly
> > production quality but it is still useful for a closed environment
> > with known guest configurations.
> 
> That doesn't seem to be what happens in this series, 

Seems like something was missed then

> nor does it really make sense to me that userspace would simply
> decide to truncate the dirty tracking ranges array.

Who else would do it?
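
To spell out what I mean by the fallback (rough untested sketch, all
struct/helper names made up, not the code in this series): userspace
clamps the span it asks the device to track and simply re-reports any
mapping it established outside that span as dirty on every sync, so a
guest that maps beyond the tracker's reach costs migration efficiency,
not correctness:

#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t iova;
    uint64_t size;
} DmaMapping;

typedef struct {
    uint64_t min_iova;
    uint64_t max_iova;          /* inclusive top of the tracked span */
} TrackedSpan;

static bool mapping_is_tracked(const DmaMapping *m, const TrackedSpan *t)
{
    return m->size &&
           m->iova >= t->min_iova &&
           m->iova + m->size - 1 <= t->max_iova;
}

/* Called on every dirty bitmap sync, after querying the device tracker. */
static void resend_untracked_mappings(const DmaMapping *maps, int n,
                                      const TrackedSpan *t,
                                      void (*mark_dirty)(const DmaMapping *))
{
    for (int i = 0; i < n; i++) {
        if (!mapping_is_tracked(&maps[i], t)) {
            /* Outside the tracked span: treat as permanently dirty. */
            mark_dirty(&maps[i]);
        }
    }
}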

> > No, 2**64 is too big a number to be reasonable.
> 
> So what are the actual restrictions we're dealing with here?  I think it
> would help us collaborate on a solution if we didn't have these device
> specific restrictions sprinkled through the base implementation.

Hmm? It was always like this; the driver gets to decide whether it accepts
the proposed tracking ranges or not. Given how the implementation has
to work there is no device that could do 2**64...

At least for mlx5 it is in the multi-TB range. Enough for physical
memory on any real server.
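
The shape of the driver-side decision is roughly this (made-up numbers
and names, not the mlx5 code): work out the span the proposed ranges
cover and fail the ioctl if the tracker can't cover it:

#include <errno.h>
#include <stdint.h>

#define TRACKER_MAX_SPAN (8ULL << 40)   /* say the HW can cover 8 TiB */

struct tracking_range {
    uint64_t iova;
    uint64_t length;
};

/* Return 0 if the proposed ranges fit in the tracker, negative errno if not. */
static int validate_tracking_ranges(const struct tracking_range *r, int n)
{
    uint64_t lo = UINT64_MAX, hi = 0;

    if (n == 0) {
        return -EINVAL;
    }
    for (int i = 0; i < n; i++) {
        if (r[i].iova < lo) {
            lo = r[i].iova;
        }
        if (r[i].iova + r[i].length > hi) {
            hi = r[i].iova + r[i].length;
        }
    }
    /* A span anywhere near 2^64 can never fit, so the request is rejected. */
    return hi - lo > TRACKER_MAX_SPAN ? -E2BIG : 0;
}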

> > Ideally we'd work it the other way and tell the vIOMMU that the vHW
> > only supports a limited number of address bits for the translation, eg
> > through the ACPI tables. Then the dirty tracking could safely cover
> > the larger of all system memory or the limited IOVA address space.
> 
> Why can't we do that?  Hotplug is an obvious issue, but maybe it's not
> vHW telling the vIOMMU a restriction, maybe it's a QEMU machine or
> vIOMMU option and if it's not set to something the device can support,
> migration is blocked.

I don't know, maybe we should if we can.
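
(IIRC intel-iommu already exposes the guest address width as a property,
x-aw-bits, so that could be the knob.) The enforcement side would be
simple enough; a sketch with invented names, where in QEMU proper the
failure path would feed migrate_add_blocker() rather than printing:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static bool viommu_fits_tracker(unsigned aw_bits, uint64_t tracker_max_iova)
{
    /* Highest IOVA the guest could program with this address width. */
    uint64_t viommu_max_iova =
        aw_bits >= 64 ? UINT64_MAX : (1ULL << aw_bits) - 1;

    if (viommu_max_iova > tracker_max_iova) {
        fprintf(stderr, "vIOMMU address width (%u bits) exceeds what the "
                "device can dirty-track; block migration\n", aw_bits);
        return false;
    }
    return true;
}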

> > Or even better figure out how to get interrupt remapping without IOMMU
> > support :\
> 
> -machine q35,default_bus_bypass_iommu=on,kernel-irqchip=split \
> -device intel-iommu,caching-mode=on,intremap=on

Joao?

If this works, let's just block migration if the vIOMMU is turned on.

Jason


