From: Alex Williamson
Subject: Re: [Qemu-devel] [PATCH v3 3/3] IOMMU: Integrate between VFIO and vIOMMU to support device assignment
Date: Mon, 6 Jun 2016 11:30:24 -0600

On Mon, 6 Jun 2016 15:38:25 +0800
Peter Xu <address@hidden> wrote:

> Some questions not quite related to this patch content but vfio...
> 
> On Mon, May 23, 2016 at 11:53:42AM -0600, Alex Williamson wrote:
> > On Sat, 21 May 2016 19:19:50 +0300
> > "Aviv B.D" <address@hidden> wrote:  
> 
> [...]
> 
> > > +#if 0
> > >  static hwaddr vfio_container_granularity(VFIOContainer *container)
> > >  {
> > >      return (hwaddr)1 << ctz64(container->iova_pgsizes);
> > >  }
> > > -
> > > +#endif  
> 
> Here we are fetching the smallest page size that the host IOMMU supports,
> so even if the host IOMMU supports large pages, they will not be used as
> long as the guest has the vIOMMU enabled, right?

Not using this replay mechanism, correct.  AFAIK, this replay code has
only been tested on POWER where the window is much, much smaller than
the 64bit address space and hugepages are not supported.  A replay
callback into the iommu could not only walk the address space
more efficiently, but also attempt to map with hugepages.  It would
however need to be cautious not to coalesce separate mappings by the
guest into a single mapping through vfio, or else we're going to have
inconsistency for mapping vs unmapping that vfio does not expect or
support.
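
To make that concrete, here is a rough sketch in plain C (not QEMU code;
guest_mapping, replay_notify and iommu_replay_mapped are invented names
used only for illustration) of a replay that walks only the ranges the
guest actually mapped, rather than stepping through the whole IOVA space
at the host IOMMU's minimum page size, while still keeping each guest
mapping as its own vfio map:

/*
 * Rough sketch, not QEMU code.  The idea: let the IOMMU side walk the
 * mappings the guest has actually created (so hugepage-sized mappings
 * stay hugepage-sized), while still reporting each guest mapping
 * separately so later unmaps match the maps vfio was asked to perform.
 */
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint64_t iova;      /* guest IOVA where the mapping starts */
    uint64_t size;      /* size of this single guest mapping */
    uint64_t gpa;       /* guest physical address it translates to */
    bool     writable;
} guest_mapping;

typedef void (*replay_notify)(const guest_mapping *m, void *opaque);

/*
 * Walk only the mappings the guest created instead of probing the IOVA
 * space one minimum-sized host page at a time.  Adjacent mappings are
 * deliberately NOT merged: the guest may unmap them individually later,
 * and vfio expects unmap requests to line up with the original maps.
 */
static void iommu_replay_mapped(const guest_mapping *maps, size_t nr,
                                replay_notify notify, void *opaque)
{
    for (size_t i = 0; i < nr; i++) {
        notify(&maps[i], opaque);
    }
}

static void vfio_like_map(const guest_mapping *m, void *opaque)
{
    (void)opaque;
    printf("map iova=0x%" PRIx64 " size=0x%" PRIx64 " -> gpa=0x%" PRIx64
           " %s\n", m->iova, m->size, m->gpa, m->writable ? "rw" : "ro");
}

int main(void)
{
    /* One 2 MiB guest mapping plus a separate, adjacent 4 KiB mapping. */
    guest_mapping maps[] = {
        { 0x100000000ULL, 2ULL << 20, 0x80000000ULL, true  },
        { 0x100200000ULL, 4096,       0x90000000ULL, false },
    };

    iommu_replay_mapped(maps, sizeof(maps) / sizeof(maps[0]),
                        vfio_like_map, NULL);
    return 0;
}

The second mapping here is IOVA-contiguous with the first but is still
reported on its own; coalescing the two into one vfio map would leave a
later per-mapping unmap by the guest inconsistent with what vfio recorded.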
 
> > 
> > 
> > Clearly this is unacceptable; the code has a purpose.
> >   
> > >  static void vfio_listener_region_add(MemoryListener *listener,
> > >                                       MemoryRegionSection *section)
> > >  {
> > > @@ -384,11 +387,13 @@ static void vfio_listener_region_add(MemoryListener *listener,
> > >          giommu->n.notify = vfio_iommu_map_notify;
> > >          QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
> > >  
> > > +        vtd_register_giommu(giommu);  
> > 
> > vfio will not assume VT-d, this is why we register the notifier below.
> >   
> > >          memory_region_register_iommu_notifier(giommu->iommu, &giommu->n);
> > > +#if 0
> > >          memory_region_iommu_replay(giommu->iommu, &giommu->n,
> > >                                     vfio_container_granularity(container),
> > >                                     false);  
> 
> For memory_region_iommu_replay(), we are using
> vfio_container_granularity() as the granularity, which is the host
> IOMMU page size. However inside it:
> 
> void memory_region_iommu_replay(MemoryRegion *mr, Notifier *n,
>                                 hwaddr granularity, bool is_write)
> {
>     hwaddr addr;
>     IOMMUTLBEntry iotlb;
> 
>     for (addr = 0; addr < memory_region_size(mr); addr += granularity) {
>         iotlb = mr->iommu_ops->translate(mr, addr, is_write);
>         if (iotlb.perm != IOMMU_NONE) {
>             n->notify(n, &iotlb);
>         }
> 
>         /* if (2^64 - MR size) < granularity, it's possible to get an
>          * infinite loop here.  This should catch such a wraparound */
>         if ((addr + granularity) < addr) {
>             break;
>         }
>     }
> }
> 
> Is it possible that the iotlb maps to a large page (or any page size
> different from granularity)? The above code seems to assume that the
> host and guest IOMMU use the same page size == granularity?

I think this is answered above.  This is not remotely efficient code
for a real 64bit IOMMU (BTW, VT-d does not support the full 64bit
address space either, I believe it's more like 48bits) and is not going
to replay hugepages, but it will give us sufficiently correct IOMMU
entries... eventually. Thanks,

Alex
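
For a sense of the scale behind "not remotely efficient" above, here is a
standalone back-of-the-envelope sketch in plain C (illustrative numbers
only, not QEMU code): with a 48-bit IOVA space and a 4 KiB minimum
granularity the replay loop performs 2^36 translate() calls, and a single
guest 2 MiB mapping is reported as 512 separate 4 KiB notifications.

/* Standalone back-of-the-envelope sketch, not QEMU code. */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const uint64_t granularity = 4096;        /* host IOMMU minimum page  */
    const uint64_t iova_space  = 1ULL << 48;  /* roughly VT-d sized space */
    const uint64_t guest_page  = 2ULL << 20;  /* one guest 2 MiB mapping  */

    /* translate() is called once per granularity-sized step of the loop. */
    printf("translate calls to scan the space: %" PRIu64 "\n",
           iova_space / granularity);          /* 2^36, ~69 billion */

    /* A single guest 2 MiB page comes back as many 4 KiB notifications. */
    printf("notifications for one 2 MiB mapping: %" PRIu64 "\n",
           guest_page / granularity);          /* 512 */
    return 0;
}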


