From: Alexey Kardashevskiy
Subject: Re: [Qemu-devel] [PATCH qemu v14 01/18] memory: Fix IOMMU replay base address
Date: Tue, 22 Mar 2016 18:19:40 +1100
User-agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.7.0

On 03/22/2016 03:59 PM, David Gibson wrote:
On Tue, Mar 22, 2016 at 03:28:52PM +1100, Alexey Kardashevskiy wrote:
On 03/22/2016 02:26 PM, David Gibson wrote:
On Tue, Mar 22, 2016 at 02:12:30PM +1100, Alexey Kardashevskiy wrote:
On 03/22/2016 11:49 AM, David Gibson wrote:
On Mon, Mar 21, 2016 at 06:46:49PM +1100, Alexey Kardashevskiy wrote:
Since a788f227 "memory: Allow replay of IOMMU mapping notifications",
all existing IOMMU mappings are replayed when a new VFIO listener is
added. However, the base address of an IOMMU memory region (IOMMU MR)
is ignored. That is not a problem for the existing user (pseries) with
its default 32bit DMA window starting at 0, but it is if there is
another DMA window.

This stores the IOMMU's offset_within_address_space and adjusts
the IOVA before calling vfio_dma_map/vfio_dma_unmap.

As the IOMMU notifier expects an IOVA offset rather than an absolute
address, this also adjusts the IOVA in the sPAPR H_PUT_TCE handler
before calling the notifier(s).

Signed-off-by: Alexey Kardashevskiy <address@hidden>
Reviewed-by: David Gibson <address@hidden>

On a closer look, I realised this still isn't quite correct, although
I don't think any cases which would break it exist or are planned.

---
  hw/ppc/spapr_iommu.c          |  2 +-
  hw/vfio/common.c              | 14 ++++++++------
  include/hw/vfio/vfio-common.h |  1 +
  3 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index 7dd4588..277f289 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -277,7 +277,7 @@ static target_ulong put_tce_emu(sPAPRTCETable *tcet, target_ulong ioba,
      tcet->table[index] = tce;

      entry.target_as = &address_space_memory,
-    entry.iova = ioba & page_mask;
+    entry.iova = (ioba - tcet->bus_offset) & page_mask;
      entry.translated_addr = tce & page_mask;
      entry.addr_mask = ~page_mask;
      entry.perm = spapr_tce_iommu_access_flags(tce);

This bit's right.
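
For illustration, the arithmetic this hunk fixes, as a minimal sketch
(the window base and page size are hypothetical, not taken from the
patch):

    /* Hypothetical second DMA window whose TCE table starts at
     * bus_offset, with 64K IOMMU pages.  An H_PUT_TCE for the third
     * page of the window must notify the offset within the window,
     * not the absolute bus address. */
    hwaddr bus_offset = 0x8000000000000000ULL;  /* assumed window base */
    hwaddr page_mask  = ~0xFFFFULL;             /* 64K pages */
    hwaddr ioba       = bus_offset + 0x20000;   /* third 64K page */
    hwaddr iova       = (ioba - bus_offset) & page_mask;  /* == 0x20000 */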

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index fb588d8..d45e2db 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -257,14 +257,14 @@ static void vfio_iommu_map_notify(Notifier *n, void *data)
      VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
      VFIOContainer *container = giommu->container;
      IOMMUTLBEntry *iotlb = data;
+    hwaddr iova = iotlb->iova + giommu->offset_within_address_space;

This bit might be right, depending on how you define 
giommu->offset_within_address_space.

      MemoryRegion *mr;
      hwaddr xlat;
      hwaddr len = iotlb->addr_mask + 1;
      void *vaddr;
      int ret;

-    trace_vfio_iommu_map_notify(iotlb->iova,
-                                iotlb->iova + iotlb->addr_mask);
+    trace_vfio_iommu_map_notify(iova, iova + iotlb->addr_mask);

      /*
       * The IOMMU TLB entry we have just covers translation through
@@ -291,21 +291,21 @@ static void vfio_iommu_map_notify(Notifier *n, void *data)

      if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
          vaddr = memory_region_get_ram_ptr(mr) + xlat;
-        ret = vfio_dma_map(container, iotlb->iova,
+        ret = vfio_dma_map(container, iova,
                             iotlb->addr_mask + 1, vaddr,
                             !(iotlb->perm & IOMMU_WO) || mr->readonly);
          if (ret) {
              error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
                           "0x%"HWADDR_PRIx", %p) = %d (%m)",
-                         container, iotlb->iova,
+                         container, iova,
                           iotlb->addr_mask + 1, vaddr, ret);
          }
      } else {
-        ret = vfio_dma_unmap(container, iotlb->iova, iotlb->addr_mask + 1);
+        ret = vfio_dma_unmap(container, iova, iotlb->addr_mask + 1);
          if (ret) {
              error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
                           "0x%"HWADDR_PRIx") = %d (%m)",
-                         container, iotlb->iova,
+                         container, iova,
                           iotlb->addr_mask + 1, ret);
          }
      }

This is fine.

@@ -377,6 +377,8 @@ static void vfio_listener_region_add(MemoryListener *listener,
           */
          giommu = g_malloc0(sizeof(*giommu));
          giommu->iommu = section->mr;
+        giommu->offset_within_address_space =
+            section->offset_within_address_space;

But here there's a problem.  The iova in IOMMUTLBEntry is relative to
the IOMMU MemoryRegion, but - at least in theory - only a subsection
of that MemoryRegion could be mapped into the AddressSpace.

But the IOMMU MR stays the same - its size and offset are fixed, and
the iova will be relative to its start - so why does it matter if only
a portion is mapped?

Because the portion mapped may not sit at the start of the MR.  For
example, if you had a 2G MR and the second half is mapped at address 0
in the AS, the iovas start at 1G while the AS addresses start at 0.

My imagination fails here. How could you do this in practice?

address_space_init(&as, &root)
memory_region_init(&mr, 2GB)
memory_region_add_subregion(&root, -1GB, &mr)

But offsets are unsigned.

In general, how do you map only half of an MR? Which
memory_region_add_xxx() does that?

I'm not totally sure, but I think you can do it with:


Ok. Got it. So, how about this:

s/offset_within_address_space/iommu_offset/

and

giommu->iommu_offset = section->offset_within_address_space -
section->offset_within_region;

?
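
Roughly, the idea being (a sketch of the proposal, not a tested patch):

    /* In vfio_listener_region_add(): record the offset between the
     * AS address and the start of the IOMMU MR the section covers. */
    giommu->iommu_offset = section->offset_within_address_space -
                           section->offset_within_region;

    /* In vfio_iommu_map_notify(): iotlb->iova is relative to the
     * IOMMU MR's start, so adding iommu_offset yields the AS address. */
    hwaddr iova = iotlb->iova + giommu->iommu_offset;

The subtraction may wrap if offset_within_region is larger, but the
later addition of iotlb->iova wraps back, so the unsigned arithmetic
still produces the right AS address modulo 2^64.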



address_space_init(&as, &root)
memory_region_init(&mr0, 2GB)
memory_region_init_alias(&mr1, &mr0, 1GB, 1GB)
memory_region_add_subregion(&root, 0, &mr1)

But the point is that it's possible for offset_within_region to be
non-zero, and if it is, you need to take it into account.
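
Spelled out with the actual signatures (a sketch; the names root/mr0/mr1
and the 2G size are illustrative), the alias example above would read
roughly:

    MemoryRegion root, mr0, mr1;
    AddressSpace as;

    memory_region_init(&root, NULL, "root", UINT64_MAX);
    memory_region_init(&mr0, NULL, "mr0", 0x80000000ULL);   /* 2G MR */
    /* Alias covering only the second 1G of mr0... */
    memory_region_init_alias(&mr1, NULL, "mr0-high", &mr0,
                             0x40000000ULL, 0x40000000ULL);
    /* ...mapped at address 0 of the address space. */
    memory_region_add_subregion(&root, 0, &mr1);
    address_space_init(&as, &root, "as");

The flatview resolves the alias back to mr0, so the listener sees a
section with section->mr == &mr0, offset_within_region == 1G and
offset_within_address_space == 0 - exactly the case where
offset_within_region has to be taken into account.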

I was not arguing this, I was trying to imagine :)



--
Alexey



reply via email to

[Prev in Thread] Current Thread [Next in Thread]