qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: completion timeouts with pin-based interrupts in QEMU hw/nvme


From: Guenter Roeck
Subject: Re: completion timeouts with pin-based interrupts in QEMU hw/nvme
Date: Thu, 12 Jan 2023 16:27:18 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2

On 1/12/23 11:27, Keith Busch wrote:
On Thu, Jan 12, 2023 at 06:45:55PM +0100, Klaus Jensen wrote:
On Jan 12 09:34, Keith Busch wrote:
On Thu, Jan 12, 2023 at 02:10:51PM +0100, Klaus Jensen wrote:

The pin-based interrupt logic in hw/nvme seems sound enough to me, so I
am wondering if there is something going on with the kernel driver (but
I certainly do not rule out that hw/nvme is at fault here, since
pin-based interrupts has also been a source of several issues in the
past).

Does it work if you change the pci_irq_assert() back to pci_irq_pulse()?
While probably not the "correct" thing to do, it has better results in
my testing.


A simple s/pci_irq_assert/pci_irq_pulse broke the device. However,

        diff --git i/hw/nvme/ctrl.c w/hw/nvme/ctrl.c
        index 03760ddeae8c..0fc46dcb9ec4 100644
        --- i/hw/nvme/ctrl.c
        +++ w/hw/nvme/ctrl.c
        @@ -477,6 +477,7 @@ static void nvme_irq_check(NvmeCtrl *n)
                 return;
             }
             if (~intms & n->irq_status) {
        +        pci_irq_deassert(&n->parent_obj);
                 pci_irq_assert(&n->parent_obj);
             } else {
                 pci_irq_deassert(&n->parent_obj);


seems to do the trick (pulse is the other way around, assert, then
deassert).

Probably not the "correct" thing to do, but I'll take it since it seems
to fix it. On a simple boot loop I got the timeout about 1 out of 5. I'm
on ~20 runs now and have not encountered it.

I'll see if I can set up a mips rootfs and test that. Guenter, what MIPS
machine/board(s) are you testing?

Could you try the below?


This works as well: 50 iterations with no failures after applying the patch
below on top of qemu v7.2.0. Tested with qemu-system-mipsel.

Guenter

---
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 2c85de4700..521c3c80c1 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -558,6 +558,20 @@ static void nvme_irq_deassert(NvmeCtrl *n, NvmeCQueue *cq)
      }
  }
+static void nvme_irq_pulse(NvmeCtrl *n, NvmeCQueue *cq)
+{
+    if (!cq->irq_enabled) {
+        return;
+    }
+
+    if (msix_enabled(&(n->parent_obj))) {
+        msix_notify(&(n->parent_obj), cq->vector);
+        return;
+    }
+
+    pci_irq_pulse(&n->parent_obj);
+}
+
  static void nvme_req_clear(NvmeRequest *req)
  {
      req->ns = NULL;
@@ -6917,6 +6931,12 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, 
int val)
              }
nvme_irq_deassert(n, cq);
+        } else {
+            /*
+             * Retrigger the irq just to make sure the host has no excuse for
+             * not knowing there's more work to complete on this CQ.
+             */
+            nvme_irq_pulse(n, cq);
          }
      } else {
          /* Submission queue doorbell write */
--




reply via email to

[Prev in Thread] Current Thread [Next in Thread]