Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for


From: Peter Xu
Subject: Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
Date: Tue, 26 Apr 2016 15:34:26 +0800
User-agent: Mutt/1.5.24 (2015-08-30)

On Mon, Apr 25, 2016 at 09:24:12AM +0200, Jan Kiszka wrote:
> On 2016-04-25 09:18, Peter Xu wrote:
> > On Mon, Apr 25, 2016 at 07:16:19AM +0200, Jan Kiszka wrote:
> >> On 2016-04-19 10:38, Peter Xu wrote:
> > 
> > [...]
> > 
> >>> By default, IR is disabled for better compatibility with current
> >>> QEMU. To enable IR, we can use the following command to boot an
> >>> IR-capable VM with a virtio-net device with vhost (kvm-ioapic is
> >>> still not supported, so we need to specify
> >>> kernel-irqchip={split|off} here):
> >>>
> >>> $ qemu-system-x86_64 -M q35,iommu=on,intr=on,kernel-irqchip=split \
> >>
> >> "intr" sounds a bit too much like "interrupt", not "interrupt
> >> remapping". Why not use the kernel's form, "intremap"?
> > 
> > Sure. It sounds nice to be aligned with the kernel one. Let me take
> > it in v5.
> > 
> >>
> >>>      -enable-kvm -m 1024 \
> >>>      -netdev tap,id=net0,vhost=on \
> >>>      -device virtio-net-pci,netdev=net0 \
> >>>      -monitor telnet::3333,server,nowait \
> >>>      /var/lib/libvirt/images/vm1.qcow2
> >>>
> >>> When the guest boots, we can verify that IR is enabled by grepping
> >>> the kernel log:
> >>>
> >>> address@hidden ~]# journalctl -k | grep "DMAR-IR"
> >>> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: IOAPIC id 0 under DRHD base  0xfed90000 IOMMU 0
> >>> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: Enabled IRQ remapping in xapic mode
> >>>
> >>> Currently supported devices:
> >>>
> >>> - Emulated/split irqchip
> >>> - Generic PCI Devices
> >>> - vhost devices
> >>> - Device pass-through: not tested, but it should presumably work.
> >>
> >> I've tested this series against my Jailhouse setup, and it works pretty
> >> well! Actually considering to move my test setup over this branch.
> > 
> > This is really encouraging feedback! Btw, thanks for all kinds of
> > help on this patchset. :-)
> > 
> >>
> >> However, split irqchip still has some issues: When I boot a q35 machine
> >> with Linux, the e1000 network adapter only gets a single IRQ delivered.
> >> Interestingly, other IOAPIC IRQs like the keyboard work all the time. I
> >> didn't debug this in details yet.
> > 
> > I reproduced this problem. It seems that it fails even with
> > kernel-irqchip=off. Will try to dig it out.
> 
> Very good. Hope it can be easily fixed.

Hi, Jan,

The above issue is most likely caused by a missing EOI for
level-triggered interrupts. Until now I had only tested with
edge-triggered interrupts, so I did not run into it. Could you please
try the patch below? It applies directly on top of the series and
should solve the issue (it works on my test VM, and I'll include it
in v5 if it also works for you):

-------------------------

diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
index b41ab89..de6a8cf 100644
--- a/hw/intc/ioapic.c
+++ b/hw/intc/ioapic.c
@@ -281,6 +281,36 @@ ioapic_mem_read(void *opaque, hwaddr addr, unsigned int size)
     return val;
 }

+/*
+ * This is to satisfy the hack in Linux kernel. One hack of it is to
+ * simulate clearing the Remote IRR bit of IOAPIC entry using the
+ * following:
+ *
+ * "For IO-APIC's with EOI register, we use that to do an explicit EOI.
+ * Otherwise, we simulate the EOI message manually by changing the trigger
+ * mode to edge and then back to level, with RTE being masked during
+ * this."
+ *
+ * (See linux kernel __eoi_ioapic_pin() comment in commit c0205701)
+ *
+ * This is based on the assumption that, Remote IRR bit will be
+ * cleared by IOAPIC hardware for edge-triggered interrupts (I
+ * believe that's what the IOAPIC version 0x1X hardware does). So
+ * if we are emulating it, we'd better do it the same here, so that
+ * the guest kernel hack will work as well on QEMU.
+ *
+ * Without this, level-triggered interrupts in IR mode might fail to
+ * work correctly.
+ */
+static inline void
+ioapic_fix_edge_remote_irr(uint64_t *entry)
+{
+    if (!(*entry & IOAPIC_LVT_TRIGGER_MODE)) {
+        /* Edge-triggered interrupts, make sure remote IRR is zero */
+        *entry &= ~((uint64_t)IOAPIC_LVT_REMOTE_IRR);
+    }
+}
+
 static void
 ioapic_mem_write(void *opaque, hwaddr addr, uint64_t val,
                  unsigned int size)
@@ -314,6 +344,7 @@ ioapic_mem_write(void *opaque, hwaddr addr, uint64_t val,
                     s->ioredtbl[index] &= ~0xffffffffULL;
                     s->ioredtbl[index] |= val;
                 }
+                ioapic_fix_edge_remote_irr(&s->ioredtbl[index]);
                 ioapic_service(s);
             }
         }

------------------------
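
To illustrate the sequence the comment in the patch describes, here is a
minimal standalone sketch of the guest's "simulated EOI" against an
emulated redirection table entry (RTE). It is not QEMU or kernel code:
the RTE_* names and both helpers are hypothetical, the bit positions
follow the 82093AA RTE layout, and treating Remote IRR as read-only for
the guest is an assumption of this model.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical masks for this sketch (82093AA RTE layout). */
#define RTE_REMOTE_IRR   (1ULL << 14)
#define RTE_TRIGGER_MODE (1ULL << 15)   /* 0 = edge, 1 = level */
#define RTE_MASKED       (1ULL << 16)

/*
 * Device side: handle a guest write to an RTE.
 * Assumption: Remote IRR is read-only for the guest, so the old value is
 * carried over first. It is then dropped if the entry is edge-triggered.
 */
static uint64_t rte_write(uint64_t old, uint64_t val)
{
    uint64_t entry = (val & ~RTE_REMOTE_IRR) | (old & RTE_REMOTE_IRR);

    if (!(entry & RTE_TRIGGER_MODE)) {
        entry &= ~RTE_REMOTE_IRR;
    }
    return entry;
}

/*
 * Guest side: simulate an EOI for a level-triggered pin by masking the
 * entry, switching it to edge, and then restoring the saved value.
 */
static uint64_t guest_simulated_eoi(uint64_t entry)
{
    uint64_t saved = entry;

    assert(entry & RTE_TRIGGER_MODE);   /* only meaningful for level */
    entry = rte_write(entry, (saved | RTE_MASKED) & ~RTE_TRIGGER_MODE);
    entry = rte_write(entry, saved);
    return entry;
}
```

Starting from a level-triggered entry with Remote IRR set, the value
returned by guest_simulated_eoi() is level-triggered and unmasked again,
with Remote IRR clear, so the pin can be serviced again.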

I am still looking into the guest-side code. Although the patch above
should solve this issue, there are other problems in the guest code
when IR is enabled:

- The "vector" in the IOAPIC entry does not match the one in the IRTE
  (the VT-d spec, section 5.1.5.1, requires them to match, and I guess
  this is also needed to deliver the EOI broadcast correctly). See
  intel_irq_remapping_prepare_irte():

        ...
        /*
         * IO-APIC RTE will be configured with virtual vector.
         * irq handler will do the explicit EOI to the io-apic.
         */
        entry->vector   = info->ioapic_pin;
        ...

- I see level-triggered IOAPIC entries being marked as edge-triggered
  interrupts in the APIC (which is strange)... This also prevents
  correct delivery of the EOI broadcast. I still need time to figure
  out why.

If EOI broadcast worked, the e1000 issue would be solved even without
the patch above.
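
A rough sketch of why the vector mismatch matters for EOI broadcast,
assuming the IOAPIC model clears Remote IRR only for level-triggered
entries whose vector field equals the vector carried by the EOI. The
function and RTE_* names are hypothetical and are not taken from QEMU:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical masks for this sketch (82093AA RTE layout). */
#define RTE_VECTOR_MASK  0xffULL
#define RTE_REMOTE_IRR   (1ULL << 14)
#define RTE_TRIGGER_MODE (1ULL << 15)   /* 0 = edge, 1 = level */

/*
 * Handle an EOI broadcast for 'vector'. Returns how many entries had
 * their Remote IRR bit cleared.
 */
static int eoi_broadcast(uint64_t *rtes, size_t n, uint8_t vector)
{
    int cleared = 0;

    for (size_t i = 0; i < n; i++) {
        if ((rtes[i] & RTE_VECTOR_MASK) != vector) {
            continue;                   /* EOI is for another vector */
        }
        if (!(rtes[i] & RTE_TRIGGER_MODE)) {
            continue;                   /* edge entries have no Remote IRR */
        }
        if (rtes[i] & RTE_REMOTE_IRR) {
            rtes[i] &= ~RTE_REMOTE_IRR;
            cleared++;
        }
    }
    return cleared;
}
```

If the guest programs the RTE vector field with the pin number (say 19)
while the IRTE delivers some other vector to the CPU, the EOI for that
other vector matches no entry in this model and Remote IRR stays set.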

[...]

> > 
> >>
> >>> - IR fault reporting
> >>
> >> Would be welcome! I found a "test case" yesterday: misconfigured IOAPIC
> >> ID blocked its IRQs under Jailhouse, and I first had to enable tracing
> >> to realize it ;).
> > 
> > Yes, it sounds nice to have guest side feedback on IR faults. Will
> > do more reading, and see whether I can add one more patch in v5 to
> > do this.
> 
> It's not a must-have for getting things merged. In fact, any additional
> feature that could now delay the merge of what you have should rather
> wait. Stabilizing, addressing style and structure comments is more
> important IMO.

Okay, then I'll add this to my todo list and pick it up when I have
time.

Thanks,

-- peterx


