Hi Alex,
Thanks very much for reviewing the patches.
On July 28, 2023 1:25 AM, Alex Williamson <alex.williamson@redhat.com> wrote:
On Thu, 27 Jul 2023 03:24:08 -0400
Jing Liu <jing2.liu@intel.com> wrote:
From: Reinette Chatre <reinette.chatre@intel.com>
Kernel provides the guidance of dynamic MSI-X allocation support of
passthrough device, by clearing the VFIO_IRQ_INFO_NORESIZE flag to
guide user space.
Fetch and store the flags from host for later use to determine if
specific flags are set.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
---
hw/vfio/pci.c | 12 ++++++++++++
hw/vfio/pci.h | 1 +
hw/vfio/trace-events | 2 ++
3 files changed, 15 insertions(+)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index
a205c6b1130f..0c4ac0873d40 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1572,6 +1572,7 @@ static void vfio_msix_early_setup(VFIOPCIDevice
*vdev, Error **errp)
static int vfio_msix_setup(VFIOPCIDevice *vdev, int pos, Error
**errp) {
+ struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info) };
int ret;
Error *err = NULL;
@@ -1624,6 +1625,17 @@ static int vfio_msix_setup(VFIOPCIDevice *vdev, int
pos, Error **errp)
memory_region_set_enabled(&vdev->pdev.msix_table_mmio, false);
}
+ irq_info.index = VFIO_PCI_MSIX_IRQ_INDEX;
+ ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_IRQ_INFO, &irq_info);
+ if (ret) {
+ /* This can fail for an old kernel or legacy PCI dev */
+ trace_vfio_msix_setup_get_irq_info_failure(strerror(errno));
We only call vfio_msix_setup() if the device has an MSI-X capability, so the
"legacy PCI" portion of this comment seems unjustified.
Otherwise the GET_IRQ_INFO ioctl has always existed, so I'd also question the
"old kernel" part of this comment.
Oh, yes, I just realize that only VFIO_PCI_ERR_IRQ_INDEX and
VFIO_PCI_REQ_IRQ_INDEX were added later in include/uapi/linux/vfio.h. Thus,
this ioctl() with MSIX index would not fail by the old-kernel or legacy-PCI
reason.
Thanks for pointing out this to me.
We don't currently sanity test the device
exposed MSI-X info versus that reported by GET_IRQ_INFO, but it seems valid to
do so.
Do we want to keep the check of possible failure from kernel (e.g., -EFAULT)
and report
the error code back to caller? Maybe like this,
static void vfio_msix_early_setup(VFIOPCIDevice *vdev, Error **errp)
{
....
msix->entries = (ctrl & PCI_MSIX_FLAGS_QSIZE) + 1;
ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_IRQ_INFO, &irq_info);
if (ret < 0) {
error_setg_errno(errp, -ret, "failed to get MSI-X IRQ INFO");
return;
} else {
vdev->msix->noresize = !!(irq_info.flags & VFIO_IRQ_INFO_NORESIZE);
}
...
trace_vfio_msix_early_setup(vdev->vbasedev.name, pos, msix->table_bar,
msix->table_offset, msix->entries,
vdev->msix->noresize);
Thanks,
Jing
Alex
} VFIOMSIXInfo;
#define TYPE_VFIO_PCI "vfio-pci"
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index
ee7509e68e4f..7d4a398f044d 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -28,6 +28,8 @@ vfio_pci_read_config(const char *name, int addr, int
len, int val) " (%s, @0x%x, vfio_pci_write_config(const char *name, int addr,
int
val, int len) " (%s, @0x%x, 0x%x, len=0x%x)"
vfio_msi_setup(const char *name, int pos) "%s PCI MSI CAP @0x%x"
vfio_msix_early_setup(const char *name, int pos, int table_bar, int offset,
int
entries) "%s PCI MSI-X CAP @0x%x, BAR %d, offset 0x%x, entries %d"
+vfio_msix_setup_get_irq_info_failure(const char *errstr)
"VFIO_DEVICE_GET_IRQ_INFO failure: %s"
+vfio_msix_setup_irq_info_flags(const char *name, uint32_t flags) " (%s) MSI-X
irq info flags 0x%x"
vfio_check_pcie_flr(const char *name) "%s Supports FLR via PCIe cap"
vfio_check_pm_reset(const char *name) "%s Supports PM reset"
vfio_check_af_flr(const char *name) "%s Supports FLR via AF cap"