qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 15/19] vfio-user: handle device interrupts


From: Stefan Hajnoczi
Subject: Re: [PATCH v6 15/19] vfio-user: handle device interrupts
Date: Tue, 29 Mar 2022 15:24:39 +0100

On Sat, Mar 26, 2022 at 11:47:36PM +0000, Jag Raman wrote:
> 
> 
> > On Mar 7, 2022, at 5:24 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > 
> > On Thu, Feb 17, 2022 at 02:49:02AM -0500, Jagannathan Raman wrote:
> >> Forward remote device's interrupts to the guest
> >> 
> >> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
> >> Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
> >> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
> >> ---
> >> include/hw/pci/pci.h              |   6 ++
> >> include/hw/remote/vfio-user-obj.h |   6 ++
> >> hw/pci/msi.c                      |  13 +++-
> >> hw/pci/msix.c                     |  12 +++-
> >> hw/remote/machine.c               |  11 +--
> >> hw/remote/vfio-user-obj.c         | 107 ++++++++++++++++++++++++++++++
> >> stubs/vfio-user-obj.c             |   6 ++
> >> MAINTAINERS                       |   1 +
> >> hw/remote/trace-events            |   1 +
> >> stubs/meson.build                 |   1 +
> >> 10 files changed, 158 insertions(+), 6 deletions(-)
> >> create mode 100644 include/hw/remote/vfio-user-obj.h
> >> create mode 100644 stubs/vfio-user-obj.c
> >> 
> >> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> >> index c3f3c90473..d42d526a48 100644
> >> --- a/include/hw/pci/pci.h
> >> +++ b/include/hw/pci/pci.h
> >> @@ -129,6 +129,8 @@ typedef uint32_t PCIConfigReadFunc(PCIDevice *pci_dev,
> >> typedef void PCIMapIORegionFunc(PCIDevice *pci_dev, int region_num,
> >>                                 pcibus_t addr, pcibus_t size, int type);
> >> typedef void PCIUnregisterFunc(PCIDevice *pci_dev);
> >> +typedef void PCIMSINotify(PCIDevice *pci_dev, unsigned vector);
> >> +typedef void PCIMSIxNotify(PCIDevice *pci_dev, unsigned vector);
> >> 
> >> typedef struct PCIIORegion {
> >>     pcibus_t addr; /* current PCI mapping address. -1 means not mapped */
> >> @@ -323,6 +325,10 @@ struct PCIDevice {
> >>     /* Space to store MSIX table & pending bit array */
> >>     uint8_t *msix_table;
> >>     uint8_t *msix_pba;
> >> +
> >> +    PCIMSINotify *msi_notify;
> >> +    PCIMSIxNotify *msix_notify;
> >> +
> >>     /* MemoryRegion container for msix exclusive BAR setup */
> >>     MemoryRegion msix_exclusive_bar;
> >>     /* Memory Regions for MSIX table and pending bit entries. */
> >> diff --git a/include/hw/remote/vfio-user-obj.h 
> >> b/include/hw/remote/vfio-user-obj.h
> >> new file mode 100644
> >> index 0000000000..87ab78b875
> >> --- /dev/null
> >> +++ b/include/hw/remote/vfio-user-obj.h
> >> @@ -0,0 +1,6 @@
> >> +#ifndef VFIO_USER_OBJ_H
> >> +#define VFIO_USER_OBJ_H
> >> +
> >> +void vfu_object_set_bus_irq(PCIBus *pci_bus);
> >> +
> >> +#endif
> >> diff --git a/hw/pci/msi.c b/hw/pci/msi.c
> >> index 47d2b0f33c..93f5e400cc 100644
> >> --- a/hw/pci/msi.c
> >> +++ b/hw/pci/msi.c
> >> @@ -51,6 +51,8 @@
> >>  */
> >> bool msi_nonbroken;
> >> 
> >> +static void pci_msi_notify(PCIDevice *dev, unsigned int vector);
> >> +
> >> /* If we get rid of cap allocator, we won't need this. */
> >> static inline uint8_t msi_cap_sizeof(uint16_t flags)
> >> {
> >> @@ -225,6 +227,8 @@ int msi_init(struct PCIDevice *dev, uint8_t offset,
> >>     dev->msi_cap = config_offset;
> >>     dev->cap_present |= QEMU_PCI_CAP_MSI;
> >> 
> >> +    dev->msi_notify = pci_msi_notify;
> > 
> > Are you sure it's correct to skip the msi_is_masked() logic? I think the
> > callback function should only override the behavior of
> > msi_send_message(), not the entire msi_notify() function.
> > 
> > The same applies to MSI-X.
> 
> Hi Stefan,
> 
> We noticed that the client is handling the masking and unmasking of MSIx
> interrupts.
> 
> Concerning MSIx, vfio_msix_vector_use() handles unmasking and
> vfio_msix_vector_release() handles masking operations. The server triggers
> an MSIx interrupt by signaling the eventfd associated with the vector. If the 
> vector
> is unmasked, the interrupt bypasses the client/QEMU and takes this
> path: “server -> KVM -> guest”. Whereas, if the vector is masked, it lands on 
> the
> client via: “server -> vfio_msi_interrupt()”. vfio_msi_interrupt() suppresses 
> the
> interrupt if the vector is masked. The use and release functions switch the 
> server’s
> eventfd between VFIOPCIDevice->VFIOMSIVector[i]->kvm_interrupt and
> VFIOPCIDevice->VFIOMSIVector[i]->interrupt using the
> VFIO_DEVICE_SET_IRQS message.
> 
> Concerning MSI, the server should check if the vector is unmasked before
> triggering. The server is not doing it presently, will update it. For some 
> reason,
> I had assumed that MSI handling is similar to MSIx in terms of masking - sorry
> about that. The masking and unmasking information for MSI is in the config 
> space
> registers, so the server should have this information.
> 
> You had previously suggested using callbacks for msi_get_message &
> msi_send_message, considering the masking issue. Given MSIx masking
> (including the MSIx table BAR) is handled at the client, the masking 
> information
> doesn’t reach the server - so msix_notify will never invoke the
> msi_send_message callback - all the vectors remain masked at the server
> end (msix_init() -> msix_mask_all()).

I was expecting vfio-user devices to be involved in MSI-X masking so
they can implement the Pending bit semantics described in the spec:

  If a masked vector has its Pending bit Set, and the associated
  underlying interrupt events are somehow satisfied (usually by software
  though the exact manner is Function-specific), the Function must Clear
  the Pending bit, to avoid sending a spurious interrupt message later
  when software unmasks the vector.

Does QEMU VFIO support this?

What happens when a hw/net/e1000e_core.c vfio-user device uses
msix_clr_pending() and related APIs?

Also, having the vfio-user daemon write to the eventfd while the vector
is masked is a waste of CPU cycles. The PCIe spec describes using MSI-X
masking for poll mode operation and going via eventfd is suboptimal:

  Software is permitted to mask one or more vectors indefinitely, and
  service their associated interrupt events strictly based on polling
  their Pending bits. A Function must Set and Clear its Pending bits as
  necessary to support this “pure polling” mode of operation.

Maybe the answer is these issues don't matter in practice because MSI-X
masking is not used much?

Stefan

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]