Re: [Qemu-devel] [RFC 5/5] vifo: introduce new VFIO ioctl VFIO_DEVICE_PC


From: Tian, Kevin
Subject: Re: [Qemu-devel] [RFC 5/5] vifo: introduce new VFIO ioctl VFIO_DEVICE_PCI_GET_DIRTY_BITMAP
Date: Fri, 7 Jul 2017 06:40:58 +0000

> From: Alex Williamson [mailto:address@hidden
> Sent: Saturday, July 1, 2017 1:00 AM
> 
> On Fri, 30 Jun 2017 05:14:40 +0000
> "Tian, Kevin" <address@hidden> wrote:
> 
> > > From: Alex Williamson [mailto:address@hidden
> > > Sent: Friday, June 30, 2017 4:57 AM
> > >
> > > On Thu, 29 Jun 2017 00:10:59 +0000
> > > "Tian, Kevin" <address@hidden> wrote:
> > >
> > > > > From: Alex Williamson [mailto:address@hidden
> > > > > Sent: Thursday, June 29, 2017 12:00 AM
> > > > > Thanks Kevin.  So really it's not really a dirty bitmap, it's just a
> > > > > bitmap of pages that the device has access to and may have dirtied.
> > > > > > Don't we have this more generally in the vfio type1 IOMMU backend?
> > > > > > For a mediated device, we know all the pages that the vendor driver
> > > > > > has asked to be pinned.  Should we perhaps make this interface on the
> > > > > > vfio container rather than the device?  Any mediated device can
> > > > > > provide this level of detail without specific vendor support.  If we
> > > > > > had DMA page
> > > > > faulting, this would be the natural place to put it as well, so maybe
> > > > > we should design the interface there to support everything similarly.
> > > > > Thanks,
> > > > >
> > > >
> > > > That's a nice idea. Just two comments:
> > > >
> > > > > 1) If some mediated device has its own way to construct a true dirty
> > > > > bitmap (not through DMA page faulting), the interface is better designed
> > > > > to allow that flexibility. Maybe an optional callback: if it is not
> > > > > registered, use the common type1 IOMMU logic; otherwise prefer the
> > > > > vendor-specific callback.
> > >
> > > I'm not sure what that looks like, but I agree with the idea.  Could
> > > > the pages that type1 knows about ever be anything other than a
> > > superset of the dirty pages?  Perhaps a device ioctl to flush unused
> > > mappings would be sufficient.
> >
> > > Sorry, I didn't quite get your idea here. My understanding is that
> > > type1 is OK as an alternative in case a mediated device has no way
> > > to track dirtied pages (as for Intel GPU), so we can use type1 pinned
> > > pages as an indirect way to indicate dirtied pages. But if a mediated
> > > device has its own way (e.g. a device-private MMU) to track dirty
> > > pages, then we should allow that device to provide the dirty bitmap
> > > instead of using type1.
> 
> My thought was that our current mdev iommu interface allows the vendor
> driver to pin specific pages.  In order for the mdev device to dirty a
> page, we need for it to be pinned.  Therefore at worst, the set of
> pages pinned in type1 is the superset of all pages that can potentially
> be dirtied by the device.  In the worst case, this devolves to all
> pages mapped through the iommu in the case of direct assigned devices.
> My assertion is therefore that a device specific dirty page bitmap can
> only be a subset of the type1 pinned pages.  Therefore if the mdev

this assertion is correct.

> vendor driver can flush any stale pinnings, then the type1 view of
> pinned pages should match the device's view of the current working set.
> Then we wouldn't need a device specific dirty bitmap, we'd only need a
> mechanism to trigger a flush of stale mappings on the device.

What's your definition of 'stale pinnings'? Take Intel vGPU for example:
we pin pages when they are mapped into the GPU page table and unpin
them when they are unmapped from the GPU page table. Under this policy
the type1 view of pinned pages should already match the device view.
I can't think of an example of a 'stale' page, since every currently
pinned page must be pinned for some read/write purpose.
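
To make the pin-on-map / unpin-on-unmap policy concrete, here is a minimal
Python sketch. It is only a toy model, not kernel or i915 code: the class
and method names are hypothetical, and a plain set of guest frame numbers
stands in for whatever structure type1 really keeps.

```python
class PinnedPages:
    """Toy model of the pages pinned on behalf of one mdev (hypothetical)."""

    def __init__(self):
        # Guest page frame numbers that are currently pinned.
        self._pinned = set()

    def map_to_gpu_page_table(self, gfn):
        # Policy from the mail: pin when the page is mapped for the GPU.
        self._pinned.add(gfn)

    def unmap_from_gpu_page_table(self, gfn):
        # ...and unpin as soon as it is unmapped, so nothing goes stale.
        self._pinned.discard(gfn)

    def may_be_dirty_bitmap(self, start_gfn, npages):
        # One bit per page in [start_gfn, start_gfn + npages).
        # A set bit means "pinned, so possibly dirtied", not "known dirty".
        bitmap = bytearray((npages + 7) // 8)
        for gfn in self._pinned:
            idx = gfn - start_gfn
            if 0 <= idx < npages:
                bitmap[idx // 8] |= 1 << (idx % 8)
        return bytes(bitmap)
```

With this policy the bitmap is always a superset of the pages the device
could actually have written, which is the property discussed above.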

But I do realize one limitation of this option now. When both a
physical device and an mdev are assigned to the same guest, we
have to pin all pages for the physically assigned device. In that case we
may need a way to differentiate which device a given pinned page
is for. I'm not sure whether that is easy or requires lots of changes.
Interface-wise we don't need an additional one, since there is still the
unpin request from the mdev path. Only the internal structure needs to
be extended to maintain that ownership info.
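
A rough sketch of what such ownership info could look like, again as a
Python toy model rather than real type1 code. The names (OwnedPinnedPages,
the device id strings) are made up for illustration; the only point is
that each pinned page remembers which devices asked for it.

```python
from collections import defaultdict


class OwnedPinnedPages:
    """Pinned pages with per-device ownership (hypothetical structure)."""

    def __init__(self):
        # gfn -> set of device ids that currently hold a pin on that page.
        self._owners = defaultdict(set)

    def pin(self, device_id, gfn):
        self._owners[gfn].add(device_id)

    def unpin(self, device_id, gfn):
        owners = self._owners.get(gfn)
        if owners is None:
            return
        owners.discard(device_id)
        if not owners:
            # Last owner gone: the page is no longer pinned at all.
            del self._owners[gfn]

    def pinned_by(self, device_id):
        # Pages this particular device may have dirtied.
        return {gfn for gfn, owners in self._owners.items()
                if device_id in owners}
```

For example, if a physically assigned device pins pages 0-3 and the mdev
pins only page 2, pinned_by() for the mdev returns just {2}, while the
physical device still reports all four pages.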

> 
> Otherwise I'm not sure how we cleanly create an interface where the
> dirty bitmap can either come from the device or the container... but
> I'd welcome suggestions.  Thanks,
> 

My original thought is simple:

- by default, type1 can provide the dirty bitmap
- when an mdev is created, the mdev driver has a chance to claim whether
it would like to track its own dirty bitmap
- then when Qemu requests the dirty bitmap of an mdev, vfio checks the
flag to decide whether to return the type1 bitmap or call into the mdev driver
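
The three steps above could be sketched roughly as follows. This is Python
pseudo-structure only, under the assumption that the "flag" is simply an
optional callback supplied at mdev creation; Mdev, track_own_dirty and
get_dirty_pages are hypothetical names, not an existing vfio interface.

```python
from typing import Callable, Iterable, Optional, Set


class Mdev:
    """Toy mdev: type1 pinned pages plus an optional vendor dirty tracker."""

    def __init__(self, name: str, type1_pinned: Iterable[int],
                 track_own_dirty: Optional[Callable[[], Set[int]]] = None):
        self.name = name
        # What type1 knows: every page pinned for this device.
        self.type1_pinned = set(type1_pinned)
        # Claimed at creation time; None means "use the type1 default".
        self.track_own_dirty = track_own_dirty


def get_dirty_pages(mdev: Mdev) -> Set[int]:
    # vfio checks the flag: vendor callback if claimed, else type1 pinned set.
    if mdev.track_own_dirty is not None:
        return mdev.track_own_dirty()
    return set(mdev.type1_pinned)
```

A device without its own tracking gets the whole pinned set back, while a
device that registered a callback can return a smaller, precise subset.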

Thanks
kevin


