qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v5 07/11] vfio: Add guest side IOMMU support


From: Alex Williamson
Subject: Re: [Qemu-devel] [PATCH v5 07/11] vfio: Add guest side IOMMU support
Date: Mon, 31 Mar 2014 13:59:46 -0600

On Fri, 2014-03-28 at 16:12 +1100, Alexey Kardashevskiy wrote:
> On 03/20/2014 04:25 PM, David Gibson wrote:
> > On Wed, Mar 19, 2014 at 01:57:41PM -0600, Alex Williamson wrote:
> >> On Wed, 2014-03-12 at 16:52 +1100, Alexey Kardashevskiy wrote:
> >>> From: David Gibson <address@hidden>
> > [snip]
> >>> +    if (!memory_region_is_ram(mr)) {
> >>> +        DPRINTF("iommu map to non memory area %"HWADDR_PRIx"\n",
> >>> +                xlat);
> >>> +        return;
> >>> +    }
> >>> +    if (len & iotlb->addr_mask) {
> >>> +        DPRINTF("iommu has granularity incompatible with target AS\n");
> >>
> >> Is this possible?  Assuming len is initially a power-of-2, would the
> >> translate function change it?  Maybe worth a comment to explain.
> > 
> > translate can absolutely change the length.  It will generally
> > truncate it to the IOMMU page size, in fact.
> > 
> > [snip]
> >>> +        DPRINTF("region_add [iommu] %"HWADDR_PRIx" - %"HWADDR_PRIx"\n",
> >>> +                iova, int128_get64(int128_sub(llend, int128_one())));
> >>> +        /*
> >>> +         * FIXME: We should do some checking to see if the
> >>> +         * capabilities of the host VFIO IOMMU are adequate to model
> >>> +         * the guest IOMMU
> >>> +         *
> >>> +         * FIXME: This assumes that the guest IOMMU is empty of
> >>> +         * mappings at this point - we should either enforce this, or
> >>> +         * loop through existing mappings to map them into VFIO.
> >>> +         *
> >>> +         * FIXME: For VFIO iommu types which have KVM acceleration to
> >>> +         * avoid bouncing all map/unmaps through qemu this way, this
> >>> +         * would be the right place to wire that up (tell the KVM
> >>> +         * device emulation the VFIO iommu handles to use).
> >>> +         */
> >>
> >> That's a lot of FIXMEs...  The second one in particular looks like it
> >> needs to expand a bit on why this is likely a valid assumption.  The
> >> last one is more of a TODO than a FIXME.
> > 
> > I think #2 isn't a valid assumption in general.  It was true for the
> > situation I was testing at the time, due to the order of pseries
> > initialization, so I left it to get a proof of concept reasonably
> > quickly.
> > 
> > But I think that one's a FIXME that actually needs to be fixed.
> 
> 
> But how?
> 
> At the moment, for SPAPR, the table gets cleaned when group is attached to
> a container (VFIO_GROUP_SET_CONTAINER ioctl) which happens right before
> registering the memory listener so we are fine (at least for SPAPR).
> Is that true for x86 or we need something more?
> 
> "loop through existing mappings to map them into VFIO" - this I do not
> really understand. We do not track mapping in QEMU so we cannot really loop
> through them.

Making registration of a guest IOMMU more like a MemoryListener would
solve this, the infrastructure should replay existing mappings on
startups and clear them on shutdown, then we could also get rid of the
FIXME on the region_del path.

> > [snip]
> >>> +        /*
> >>> +         * FIXME: We assume the one big unmap below is adequate to
> >>> +         * remove any individual page mappings in the IOMMU which
> >>> +         * might have been copied into VFIO.  That may not be true for
> >>> +         * all IOMMU types
> >>> +         */
> >>
> >> We assume this because the IOVA that gets unmapped is the same
> >> regardless of whether a guest IOMMU is present?
> > 
> > Uh.. no.  This assumption works for a page table based IOMMU where a
> > big unmap just flattens a large range of IO-PTEs. 
> 
> 
> Sorry for my poor english but how exactly is that different from what Alex
> said? IOVA is a PCI bus address between dma_window_start and
> dma_window_start+dma_window_size.

I think David is trying to describe that the IOMMU must be able to unmap
a sparse sub-region of a larger unmap, while I'm trying to make sure
there's no IOVA translation that might interfere with using the
"standard" unmap path rather than the guest IOMMU unmap path.  Thanks,

Alex

> > It might not work
> > for some kind of extent or TLB based IOMMU, where operations are
> > expected to exactly match the addresses of map operations.
> > 
> > I don't know if IOMMUs that have trouble with this are a realistic
> > prospect, but they're at least a theoretical possibility, hence the
> > comment.
> > 
> >>
> >>> +    }
> >>> +
> >>>      iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
> >>>      end = (section->offset_within_address_space + 
> >>> int128_get64(section->size)) &
> >>>            TARGET_PAGE_MASK;
> >>
> >>
> >>
> > 
> 
> 






reply via email to

[Prev in Thread] Current Thread [Next in Thread]