qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [PATCH v5 02/11] pci: add option for net failover


From: Parav Pandit
Subject: RE: [PATCH v5 02/11] pci: add option for net failover
Date: Thu, 24 Oct 2019 16:34:01 +0000


> -----Original Message-----
> From: Jens Freimann <address@hidden>
> Sent: Thursday, October 24, 2019 4:38 AM
> To: Parav Pandit <address@hidden>
> Cc: address@hidden; address@hidden; address@hidden;
> address@hidden; address@hidden; address@hidden;
> address@hidden; address@hidden; address@hidden;
> address@hidden
> Subject: Re: [PATCH v5 02/11] pci: add option for net failover
> 
> On Thu, Oct 24, 2019 at 05:03:46AM +0000, Parav Pandit wrote:
> >> @@ -2101,6 +2104,20 @@ static void pci_qdev_realize(DeviceState
> >> *qdev, Error **errp)
> >>          }
> >>      }
> >>
> >> +    if (pci_dev->net_failover_pair_id) {
> >> +        if (!pci_is_express(pci_dev)) {
> >
> >I am testing and integrating this piece with mlx5 devices.
> >I see that pci_is_express() return true only for first PCI function.
> >Didn't yet dig the API.
> >Commenting out this check and below class check progresses further.
> 
> First of all, thanks for testing this!
> Could you share your commandline please? I can't reproduce it.
> >
I added debug prints to get the difference between VF1 and VF2 behavior.
What I see is, vfio_populate_device() below code is activated for VF2 where 
qemu claims that its not a PCIe device.

    vdev->config_size = reg_info->size;
    if (vdev->config_size == PCI_CONFIG_SPACE_SIZE) {
        vdev->pdev.cap_present &= ~QEMU_PCI_CAP_EXPRESS;
        printf("%s clearing QEMU PCI bits\n", __func__);
    }

Command line:
/usr/local/bin/qemu-system-x86_64 -enable-kvm -m 3072 -smp 3 \
               -machine q35,usb=off,vmport=off,dump-guest-core=off -cpu 
Haswell-noTSX-IBRS \
           -net none \
               -qmp unix:/tmp/qmp.socket,server,nowait \
        -monitor telnet:127.0.0.1:5556,server,nowait \
        -device pcie-root-port,id=root0,multifunction=on,chassis=0,addr=0xa \
        -device pcie-root-port,id=root1,bus=pcie.0,chassis=1 \
        -device pcie-root-port,id=root2,bus=pcie.0,chassis=2 \
        -netdev tap,id=hostnet1,fd=4 4<>/dev/tap49\
        -device 
virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:02:02:02,bus=root2,failover=on
 \
        -device 
vfio-pci,id=hostdev0,host=05:00.2,bus=root1,net_failover_pair_id=net1 \
        /var/lib/libvirt/images/sriov-lm-02.qcow2

> >While reviewing, I realized that we shouldn't have this check for below
> reasons.
> >
> >1. It is user's responsibility to pass networking device.
> >But its ok to check the class, if PCI Device is passed.
> >So class comparison should be inside the pci_check().
> 
> I'm not sure I understand this point, could you please elaborate?
> You're suggesting to move the check for the class into the check for
> pci_is_express?
> 
No. Below is the suggestion.

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 8fbf32d68c..8004309973 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2105,17 +2105,14 @@ static void pci_qdev_realize(DeviceState *qdev, Error 
**errp)
     }

     if (pci_dev->net_failover_pair_id) {
-        if (!pci_is_express(pci_dev)) {
-            error_setg(errp, "failover device is not a PCIExpress device");
-            error_propagate(errp, local_err);
-            return;
-        }
-        class_id = pci_get_word(pci_dev->config + PCI_CLASS_DEVICE);
-        if (class_id != PCI_CLASS_NETWORK_ETHERNET) {
-            error_setg(errp, "failover device is not an Ethernet device");
-            error_propagate(errp, local_err);
-            return;
-        }
+        if (pci_is_express(pci_dev)) {
+            class_id = pci_get_word(pci_dev->config + PCI_CLASS_DEVICE);
+            if (class_id != PCI_CLASS_NETWORK_ETHERNET) {
+                error_setg(errp, "failover device is not an Ethernet device");
+                error_propagate(errp, local_err);
+                return;
+            }
+       }

This will allow to map non PCI device as failover too.
After writing above hunk I think that when code reaches to check for 
If (pci_dev->net_failover_pair_id),... it is already gone gone through 
do_pci_register_device().
There should not be any check needed again for pci_is_express().
Isn't it?


> >2. It is limiting to only consider PCI devices.
> >Automated and regression tests should be able validate this feature without
> PCI Device.
> >This will enhance the stability of feature in long run.
> >
> >3. net failover driver doesn't limit it to have it over only PCI device.
> >So similarly hypervisor should be limiting.
> 
> I agree that we don't have to limit it to PCI(e) forever. But for this first 
> shot I
> think we should and then extend it continually. There are more things we can
> support in the future like other hotplug types etc.
> 
o.k. But probably net_failover_pair_id field should be in DeviceState instead 
of PCIDevice at minimum?
Or you want to refactor it later?



reply via email to

[Prev in Thread] Current Thread [Next in Thread]