qemu-devel

Re: [Qemu-devel] [PATCH] vfio: Align iova also to IOMMU page size


From: Alex Williamson
Subject: Re: [Qemu-devel] [PATCH] vfio: Align iova also to IOMMU page size
Date: Tue, 08 Dec 2015 16:42:38 -0700

On Mon, 2015-12-07 at 11:20 +0000, Peter Maydell wrote:
> On 7 December 2015 at 10:53, Pavel Fedin <address@hidden> wrote:
> >> TARGET_PAGE_ALIGN tells us that it *could* be a valid DMA target though.
> >> The VM model is capable of using that as a page size, which means we
> >> assume it is and want to generate a fault.
> >
> >  We seem to have looped back. So...
> >  It is possible to fix this according to this assumption. In this
> > case we would need to make TARGET_PAGE_BITS a variable. If we are
> > emulating ancient armv5te, it will be set to 10. For modern targets,
> > ARMv6 and newer, it will be 12.
> 
> You can't just make TARGET_PAGE_BITS a variable, it is used as a compile
> time constant in a bunch of TCG internal stuff. It would be nice
> if we didn't require it to be compile time, but it would be a lot of
> work to fix (especially if you want to avoid it being a performance
> hit).
> 
> In any case, that still doesn't fix the problem. On an AArch64
> target CPU, TARGET_PAGE_BITS still has to be 12 (for a 4K
> minimum page size), but the guest and host could still be using
> 64K pages. So your VFIO code *must* be able to deal with the
> situation where TARGET_PAGE_BITS is smaller than any alignment
> that the guest, host or IOMMU need to care about.
> 
> I still think the VFIO code needs to figure out what alignment
> it actually cares about and find some way to determine what
> that is, or alternatively if the relevant alignment is not
> possible to determine, write the code so that it doesn't
> need to care. Either way, TARGET_PAGE_ALIGN is not the answer.

Ok, let's work our way down through the relevant page sizes, host,
IOMMU, and target.

The host page size is relevant because this is the granularity with
which the kernel can pin pages.  Every IOMMU mapping must be backed by a
pinned page in the current model since we don't really have hardware to
support IOMMU page faults.

The IOMMU page size defines the granularity with which we can map IOVA
to physical memory.  The IOMMU may support multiple page sizes, but what
we're really talking about here is the minimum page size.
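That minimum is what VFIO reports to userspace as a bitmap of supported page sizes (the iova_pgsizes field returned by VFIO_IOMMU_GET_INFO); the smallest set bit is the smallest mappable granule. A minimal sketch of extracting it (the function name and the example bitmap value are illustrative, not QEMU code):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the minimum IOMMU page size from a supported-page-size
 * bitmap like the iova_pgsizes field that VFIO_IOMMU_GET_INFO
 * reports.  Each set bit is a supported page size, so the lowest
 * set bit is the smallest mappable granule.
 */
static uint64_t min_iommu_pagesize(uint64_t pgsize_bitmap)
{
    /* Isolate the lowest set bit (two's-complement trick). */
    return pgsize_bitmap & -pgsize_bitmap;
}
```

For example, an IOMMU advertising both 4k and 2M pages (bitmap 0x201000) yields a minimum page size of 0x1000.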

The target page size is relevant because this defines the minimum
possible page size used within the VM.  We presume that anything less
than TARGET_PAGE_ALIGN cannot be referenced as a page by the VM CPU and
therefore is probably not allocated as a DMA buffer for a driver running
within the guest.
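To make that concrete, here is the shape of the alignment in question, assuming the usual TARGET_PAGE_BITS of 12 (a 4k target page); this mimics QEMU's TARGET_PAGE_ALIGN round-up rather than quoting the actual macro:

```c
#include <assert.h>
#include <stdint.h>

#define TARGET_PAGE_BITS 12                         /* assumed 4k pages */
#define TARGET_PAGE_SIZE (1ULL << TARGET_PAGE_BITS)
#define TARGET_PAGE_MASK (~(TARGET_PAGE_SIZE - 1))

/*
 * Round an address up to the next target page boundary, in the
 * spirit of QEMU's TARGET_PAGE_ALIGN.  Anything that isn't page
 * aligned at this granularity is presumed not to be a page the
 * VM CPU could hand to a guest driver as a DMA buffer.
 */
static uint64_t target_page_align(uint64_t addr)
{
    return (addr + TARGET_PAGE_SIZE - 1) & TARGET_PAGE_MASK;
}
```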

An implementation detail here is that the vfio type1 IOMMU model
currently exposes the host page size as the minimum IOMMU page size.
The reason for this is to simplify page accounting: if we don't allow
sub-host-page mappings, we don't need per-page reference counting.  This
can be fixed within the current API, but kernel changes are required, or
else locked-page requirements become a problem due to over-counting.
The benefit though is that this abstracts the host page size from QEMU.

So let's take the easy scenario first: if the target page size is
greater than or equal to the minimum IOMMU page size, we're golden.  We
can map anything that could be a target DMA buffer.  This leads to the
current situation where we simply ignore any ranges that disappear when
we align to the target page size.  It can't be a DMA buffer, so ignore
it.  Note that the 64k host, 4k target problem goes away if type1
accounting is fixed to allow IOMMU-granularity mapping, since I think in
the cases we care about the IOMMU still supports 4k pages, otherwise...
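The "align and ignore what disappears" rule above can be sketched as follows (a minimal illustration under those assumptions, not the actual hw/vfio listener code; clamp_to_page is a hypothetical helper):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Align the start of [*start, *end) up and the end down to pgsize.
 * If nothing survives the clamping, the range cannot contain a
 * page-sized DMA buffer and the caller should skip it rather than
 * attempt a mapping that would fault.
 */
static bool clamp_to_page(uint64_t *start, uint64_t *end, uint64_t pgsize)
{
    uint64_t s = (*start + pgsize - 1) & ~(pgsize - 1); /* round up   */
    uint64_t e = *end & ~(pgsize - 1);                  /* round down */

    if (s >= e) {
        return false;   /* range disappeared under alignment: ignore */
    }
    *start = s;
    *end = e;
    return true;
}
```

For instance, with 4k pages the range [0x1234, 0x1800) vanishes entirely, while [0x800, 0x3400) clamps to [0x1000, 0x3000).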

Then we come to the scenario here, where the target page size is less
than the minimum IOMMU page size.  The current code intentionally
triggers the vfio type1 error indicating that such a range cannot be
mapped.  To resolve this, QEMU needs to decide whether it's ok to
provide the device with DMA access to everything on that
IOMMU-granularity page, ensure that aliases mapping the same IOMMU page
are consistent, and handle the reference counting for those sub-mappings
to avoid duplicate mappings and premature unmaps.
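The reference-counting part of that could look roughly like this (a toy sketch, not QEMU code; the fixed-size table and the sub_map/sub_unmap names are invented for illustration):

```c
#include <assert.h>
#include <stdbool.h>

#define NR_PAGES 16                 /* toy fixed-size tracking window */
static unsigned refcount[NR_PAGES]; /* one count per IOMMU granule    */

/*
 * Record a sub-page mapping landing on the given IOMMU-granularity
 * page.  Returns true when this is the first reference, i.e. when
 * the caller should issue the one real map for the whole granule.
 */
static bool sub_map(unsigned page)
{
    return refcount[page]++ == 0;
}

/*
 * Drop one sub-page mapping.  Returns true when the last reference
 * went away, i.e. only then should the real unmap be issued, which
 * avoids both duplicate mappings and premature unmaps.
 */
static bool sub_unmap(unsigned page)
{
    assert(refcount[page] > 0);
    return --refcount[page] == 0;
}
```

Two sub-mappings that alias the same IOMMU page then share one real mapping, torn down only when the second one goes away.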

So I think in the end, the one page size we care about is the minimum
IOMMU granularity.  We don't really care about the target page size at
all and maybe we only care about the host page size for determining what
might share a page with a sub-page mapping.  However, there's work to
get there (QEMU, kernel, or both depending on the specific config) and
the target page size trick has so far been a useful simplification.
Thanks,

Alex



