qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 15/23] vfio-user: forward msix BAR accesses to server


From: John Johnson
Subject: Re: [PATCH v2 15/23] vfio-user: forward msix BAR accesses to server
Date: Wed, 8 Feb 2023 06:38:30 +0000


> On Feb 6, 2023, at 12:33 PM, Alex Williamson <alex.williamson@redhat.com> 
> wrote:
> 
> On Wed,  1 Feb 2023 21:55:51 -0800
> John Johnson <john.g.johnson@oracle.com> wrote:
> 
>> Server holds device current device pending state
>> Use irq masking commands in socket case
>> 
>> Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
>> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
>> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
>> ---
>> hw/vfio/pci.h                 |  1 +
>> include/hw/vfio/vfio-common.h |  3 ++
>> hw/vfio/ccw.c                 |  1 +
>> hw/vfio/common.c              | 26 ++++++++++++++++++
>> hw/vfio/pci.c                 | 23 +++++++++++++++-
>> hw/vfio/platform.c            |  1 +
>> hw/vfio/user-pci.c            | 64 
>> +++++++++++++++++++++++++++++++++++++++++++
>> 7 files changed, 118 insertions(+), 1 deletion(-)
>> 
>> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
>> index 4f70664..d3e5d5f 100644
>> --- a/hw/vfio/pci.h
>> +++ b/hw/vfio/pci.h
>> @@ -113,6 +113,7 @@ typedef struct VFIOMSIXInfo {
>>     uint32_t table_offset;
>>     uint32_t pba_offset;
>>     unsigned long *pending;
>> +    MemoryRegion *pba_region;
>> } VFIOMSIXInfo;
>> 
>> /*
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
>> index bbc4b15..2c58d7d 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -143,6 +143,7 @@ typedef struct VFIODevice {
>>     bool ram_block_discard_allowed;
>>     bool enable_migration;
>>     bool use_regfds;
>> +    bool can_mask_irq;
> 
> How can this be a device level property?  vfio-pci (kernel) supports
> masking of INTx.  It seems like there needs to be a per-index info or
> probe here to to support this.
> 

        It is only used for MSIX masking.  MSI masking is done with
config space stores, and vfio-kernel and vfio-user handle them the
same by forwarding the config space writes to the server so it can
mask interrupts.

        MSIX is trickier because the mask bits are in a memory region.
vfio-kernel doesn’t support SET_IRQS on MSIX vectors, so if the host
doesn’t allow client mapping of the MSIX table to do the masking, vfio
switches a masked vector’s event FD destination from KVM to QEMU, then
uses the QEMU PCI emulation to mask the vector.

        vfio-user has to use messages to mask MSIX vectors, since there
is no host config space to map.  I originally forwarded the MSIX table
writes to the server to do the masking, but the feedback on the vfio-user
server changes was to add SET_IRQS support.

        I can make this a PCI MSIX-specific bool if that helps.


> 
>>     VFIODeviceOps *ops;
>>     VFIODeviceIO *io;
>>     unsigned int num_irqs;
>> @@ -239,6 +240,8 @@ void vfio_put_base_device(VFIODevice *vbasedev);
>> void vfio_disable_irqindex(VFIODevice *vbasedev, int index);
>> void vfio_unmask_single_irqindex(VFIODevice *vbasedev, int index);
>> void vfio_mask_single_irqindex(VFIODevice *vbasedev, int index);
>> +void vfio_unmask_single_irq(VFIODevice *vbasedev, int index, int irq);
>> +void vfio_mask_single_irq(VFIODevice *vbasedev, int index, int irq);
>> int vfio_set_irq_signaling(VFIODevice *vbasedev, int index, int subindex,
>>                            int action, int fd, Error **errp);
>> void vfio_region_write(void *opaque, hwaddr addr,
>> diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
>> index 00605bd..bf67670 100644
>> --- a/hw/vfio/ccw.c
>> +++ b/hw/vfio/ccw.c
>> @@ -616,6 +616,7 @@ static void vfio_ccw_get_device(VFIOGroup *group, 
>> VFIOCCWDevice *vcdev,
>>     vcdev->vdev.dev = &vcdev->cdev.parent_obj.parent_obj;
>>     vcdev->vdev.io = &vfio_dev_io_ioctl;
>>     vcdev->vdev.use_regfds = false;
>> +    vcdev->vdev.can_mask_irq = false;
>> 
>>     return;
>> 
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index de64e53..0c1cb21 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -102,6 +102,32 @@ void vfio_mask_single_irqindex(VFIODevice *vbasedev, 
>> int index)
>>     vbasedev->io->set_irqs(vbasedev, &irq_set);
>> }
>> 
>> +void vfio_mask_single_irq(VFIODevice *vbasedev, int index, int irq)
>> +{
>> +    struct vfio_irq_set irq_set = {
>> +        .argsz = sizeof(irq_set),
>> +        .flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_MASK,
>> +        .index = index,
>> +        .start = irq,
>> +        .count = 1,
>> +    };
>> +
>> +    vbasedev->io->set_irqs(vbasedev, &irq_set);
>> +}
>> +
>> +void vfio_unmask_single_irq(VFIODevice *vbasedev, int index, int irq)
>> +{
>> +    struct vfio_irq_set irq_set = {
>> +        .argsz = sizeof(irq_set),
>> +        .flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_UNMASK,
>> +        .index = index,
>> +        .start = irq,
>> +        .count = 1,
>> +    };
>> +
>> +    vbasedev->io->set_irqs(vbasedev, &irq_set);
>> +}
>> +
>> static inline const char *action_to_str(int action)
>> {
>>     switch (action) {
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index 42e7c82..7b16f8f 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -477,6 +477,7 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, 
>> unsigned int nr,
>> {
>>     VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
>>     VFIOMSIVector *vector;
>> +    bool new_vec = false;
>>     int ret;
>> 
>>     trace_vfio_msix_vector_do_use(vdev->vbasedev.name, nr);
>> @@ -490,6 +491,7 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, 
>> unsigned int nr,
>>             error_report("vfio: Error: event_notifier_init failed");
>>         }
>>         vector->use = true;
>> +        new_vec = true;
>>         msix_vector_use(pdev, nr);
>>     }
>> 
>> @@ -516,6 +518,7 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, 
>> unsigned int nr,
>>                 kvm_irqchip_commit_route_changes(&vfio_route_change);
>>                 vfio_connect_kvm_msi_virq(vector);
>>             }
>> +            new_vec = true;
> 
> This looks wrong to me, can't we get here when a previously used vector
> is unmasked?
> 

        It’s for the case where we upgrade the vector from using an event FD
that terminates in QEMU to one that terminates in KVM.  In that case we want
to send the FD again, not just unmask the current FD.



>>         }
>>     }
>> 
>> @@ -523,6 +526,8 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, 
>> unsigned int nr,
>>      * We don't want to have the host allocate all possible MSI vectors
>>      * for a device if they're not in use, so we shutdown and incrementally
>>      * increase them as needed.
>> +     * Otherwise, unmask the vector if the vector is already setup (and we 
>> can
>> +     * do so) or send the fd if not.
>>      */
>>     if (vdev->nr_vectors < nr + 1) {
>>         vdev->nr_vectors = nr + 1;
>> @@ -533,6 +538,8 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, 
>> unsigned int nr,
>>                 error_report("vfio: failed to enable vectors, %d", ret);
>>             }
>>         }
>> +    } else if (vdev->vbasedev.can_mask_irq && !new_vec) {
>> +        vfio_unmask_single_irq(&vdev->vbasedev, VFIO_PCI_MSIX_IRQ_INDEX, 
>> nr);
> 
> The correctness of @new_vec is doubtful here, but masking support must
> be in reference to a specific IRQ index.  INTx masking is implicit, but
> we'd probably need something on the VFIOPCIDevice to indicate support
> for MSI and MSIX masking support, separately.
> 

        MSI masking uses config writes, so this is only for MSIX.


>>     } else {
>>         Error *err = NULL;
>>         int32_t fd;
>> @@ -574,6 +581,12 @@ static void vfio_msix_vector_release(PCIDevice *pdev, 
>> unsigned int nr)
>> 
>>     trace_vfio_msix_vector_release(vdev->vbasedev.name, nr);
>> 
>> +    /* just mask vector if peer supports it */
>> +    if (vdev->vbasedev.can_mask_irq) {
>> +        vfio_mask_single_irq(&vdev->vbasedev, VFIO_PCI_MSIX_IRQ_INDEX, nr);
>> +        return;
>> +    }
>> +
>>     /*
>>      * There are still old guests that mask and unmask vectors on every
>>      * interrupt.  If we're using QEMU bypass with a KVM irqfd, leave all of
>> @@ -644,7 +657,7 @@ static void vfio_msix_enable(VFIOPCIDevice *vdev)
>>         if (ret) {
>>             error_report("vfio: failed to enable vectors, %d", ret);
>>         }
>> -    } else {
>> +    } else if (!vdev->vbasedev.can_mask_irq) {
>>         /*
>>          * Some communication channels between VF & PF or PF & fw rely on the
>>          * physical state of the device and expect that enabling MSI-X from 
>> the
>> @@ -660,6 +673,13 @@ static void vfio_msix_enable(VFIOPCIDevice *vdev)
>>          */
>>         vfio_msix_vector_do_use(&vdev->pdev, 0, NULL, NULL);
>>         vfio_msix_vector_release(&vdev->pdev, 0);
>> +    } else {
>> +        /*
>> +         * If we can use irq masking, send an invalid fd on vector 0
>> +         * to enable MSI-X without any vectors enabled.
>> +         */
>> +        vfio_set_irq_signaling(&vdev->vbasedev, VFIO_PCI_MSIX_IRQ_INDEX, 0,
>> +                               VFIO_IRQ_SET_ACTION_TRIGGER, -1, NULL);
> 
> What does this have to do with masking?  I'm not entirely sure this
> doesn't also work on vfio-pci (kernel).
> 
> In general this feels like a special cased version of MSI/X masking
> support rather than actually reworking things to support underlying
> masking support for these indexes.  Thanks,
> 


        The comment in the !can_mask_irq clause sounded like vfio-kernel
needs a use/release cycle, so I didn’t want to change its behavior.

                                                                JJ




reply via email to

[Prev in Thread] Current Thread [Next in Thread]