qemu-stable



From: Jason Wang
Subject: Re: [Qemu-stable] [Qemu-devel] [PATCH] intel_iommu: handle invalid ce for shadow sync
Date: Mon, 8 Oct 2018 11:08:31 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1



On 2018-09-13 15:55, Peter Xu wrote:
There are two callers of vtd_sync_shadow_page_table_range(): one
provides a valid context entry and one does not.  Move the fetching
operation into the caller vtd_sync_shadow_page_table(), which is the
one that needs to fetch the context entry.

Meanwhile, we should handle VTD_FR_CONTEXT_ENTRY_P properly when
synchronizing shadow page tables.  Having invalid context entry there is
perfectly valid when we move a device out of an existing domain.  When
that happens, instead of posting an error we invalidate the whole region.
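The decision described above can be modelled outside QEMU with a minimal sketch. Everything in it is a stand-in: the DEMO_* codes, DemoNotifier and demo_sync() are hypothetical names for illustration and are not the real VTD_FR_* codes, IOMMUNotifier or vtd_sync_shadow_page_table(). It only shows the three-way split: a successful fetch goes on to a normal sync, a "not present" entry unmaps every registered listener, and any other failure skips the sync.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in result codes; the real fault codes live in QEMU and are not
 * reproduced here. */
enum { DEMO_OK = 0, DEMO_ENTRY_NOT_PRESENT = 1, DEMO_OTHER_FAULT = 2 };

/* Hypothetical listener standing in for an IOMMU notifier. */
typedef struct DemoNotifier {
    int unmapped;                 /* set to 1 once its whole range is dropped */
    struct DemoNotifier *next;
} DemoNotifier;

/*
 * Returns -1 when the context entry was fetched and a normal sync should
 * follow; otherwise returns how many listeners were told to drop all of
 * their mappings (0 for faults other than "not present").
 */
static int demo_sync(int fetch_result, DemoNotifier *head)
{
    int count = 0;

    if (fetch_result == DEMO_OK) {
        return -1;
    }
    if (fetch_result == DEMO_ENTRY_NOT_PRESENT) {
        /* Entry not present: invalidate everything instead of erroring. */
        for (DemoNotifier *n = head; n != NULL; n = n->next) {
            n->unmapped = 1;
            count++;
        }
    }
    /* Any other fault: skip the sync for this round. */
    return count;
}
```

With two listeners chained together, a "not present" result marks both as unmapped, while any other fault leaves them untouched.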

Without this patch, QEMU will crash if we do these steps:

(1) start QEMU with VT-d IOMMU and two 10G NICs (ixgbe)
(2) bind the NICs with vfio-pci in the guest
(3) start testpmd with the NICs applied
(4) stop testpmd
(5) rebind the NIC back to ixgbe kernel driver

The patch should fix it.

Reported-by: Pei Zhang <address@hidden>
Tested-by: Pei Zhang <address@hidden>
CC: Pei Zhang <address@hidden>
CC: Alex Williamson <address@hidden>
CC: Jason Wang <address@hidden>
CC: Maxime Coquelin <address@hidden>
CC: Michael S. Tsirkin <address@hidden>
CC: QEMU Stable <address@hidden>
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1627272
Signed-off-by: Peter Xu <address@hidden>
---
  hw/i386/intel_iommu.c | 54 ++++++++++++++++++++++++++-----------------
  1 file changed, 33 insertions(+), 21 deletions(-)

Reviewed-by: Jason Wang <address@hidden>

Some nits, see below.


diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 3dfada19a6..2509520d6f 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -37,6 +37,8 @@
  #include "kvm_i386.h"
  #include "trace.h"
+static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n);
+
  static void vtd_define_quad(IntelIOMMUState *s, hwaddr addr, uint64_t val,
                              uint64_t wmask, uint64_t w1cmask)
  {
@@ -1047,39 +1049,49 @@ static int vtd_sync_shadow_page_table_range(VTDAddressSpace *vtd_as,
          .notify_unmap = true,
          .aw = s->aw_bits,
          .as = vtd_as,
+        .domain_id = VTD_CONTEXT_ENTRY_DID(ce->hi),
      };
-    VTDContextEntry ce_cache;
+
+    return vtd_page_walk(ce, addr, addr + size, &info);
+}
+
+static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as)
+{
      int ret;
+    VTDContextEntry ce;
+    IOMMUNotifier *n;
-    if (ce) {
-        /* If the caller provided context entry, use it */
-        ce_cache = *ce;
-    } else {
-        /* If the caller didn't provide ce, try to fetch */
-        ret = vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus),
-                                       vtd_as->devfn, &ce_cache);
-        if (ret) {
+    ret = vtd_dev_to_context_entry(vtd_as->iommu_state,
+                                   pci_bus_num(vtd_as->bus),
+                                   vtd_as->devfn, &ce);
+    if (ret) {
+        if (ret == -VTD_FR_CONTEXT_ENTRY_P) {
+            /*
+             * It's a valid scenario to have a context entry that is
+             * not present.  For example, when a device is removed
+             * from an existing domain then the context entry will be
+             * zeroed by the guest before it was put into another
+             * domain.  When this happens, instead of synchronizing
+             * the shadow pages we should invalidate all existing
+             * mappings and notify the backends.
+             */
+            IOMMU_NOTIFIER_FOREACH(n, &vtd_as->iommu) {
+                vtd_address_space_unmap(vtd_as, n);
+            }
+        } else {
              /*
               * This should not really happen, but in case it happens,
               * we just skip the sync for this time.  After all we even
               * don't have the root table pointer!
               */

It looks to me like the comment is not accurate: a missing root table pointer is not the only reason vtd_dev_to_context_entry() can fail.

              error_report_once("%s: invalid context entry for bus 0x%x"
-                              " devfn 0x%x",
-                              __func__, pci_bus_num(vtd_as->bus),
-                              vtd_as->devfn);
-            return 0;

I'm not quite sure error_report_once() is really needed here, since all the failures have already been traced.

+                              " devfn 0x%x", __func__,
+                              pci_bus_num(vtd_as->bus), vtd_as->devfn);
          }
+        return 0;
      }
-    info.domain_id = VTD_CONTEXT_ENTRY_DID(ce_cache.hi);
-
-    return vtd_page_walk(&ce_cache, addr, addr + size, &info);
-}
-
-static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as)
-{
-    return vtd_sync_shadow_page_table_range(vtd_as, NULL, 0, UINT64_MAX);
+    return vtd_sync_shadow_page_table_range(vtd_as, &ce, 0, UINT64_MAX);
  }

As has been discussed, this leaves the address UINT64_MAX uncovered; it would be better to take [start, end] instead of (start, range).
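To make the concern concrete, here is a minimal sketch that does not use any QEMU code. Both helpers are hypothetical, and it assumes the walk treats addr + size as an exclusive end. Under that assumption, (0, UINT64_MAX) stops one byte short of the top of the address space, and any nonzero start with the same size wraps around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical (start, size) form: the end is exclusive and computed as
 * start + size, which wraps modulo 2^64 for unsigned arithmetic. */
static bool covered_by_size(uint64_t start, uint64_t size, uint64_t addr)
{
    uint64_t end = start + size;

    return addr >= start && addr < end;
}

/* Hypothetical [start, end] form: both bounds are inclusive, so the last
 * address can be named directly and no addition is needed. */
static bool covered_by_bounds(uint64_t start, uint64_t end, uint64_t addr)
{
    assert(start <= end);
    return addr >= start && addr <= end;
}
```

covered_by_size(0, UINT64_MAX, UINT64_MAX) is false, whereas covered_by_bounds(0, UINT64_MAX, UINT64_MAX) is true. The inclusive form also avoids the wrap-around that makes covered_by_size(1, UINT64_MAX, ...) match nothing.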

Thanks
