qemu-devel

Re: [Qemu-devel] [RFC v9 14/18] vfio: improve vfio_pci_hot_reset to supp


From: Chen Fan
Subject: Re: [Qemu-devel] [RFC v9 14/18] vfio: improve vfio_pci_hot_reset to support more case
Date: Thu, 18 Jun 2015 18:27:24 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0


On 06/17/2015 11:23 PM, Alex Williamson wrote:
> On Wed, 2015-06-17 at 14:28 +0800, Chen Fan wrote:
>> On 06/16/2015 10:08 PM, Alex Williamson wrote:
>>> On Tue, 2015-06-16 at 16:10 +0800, Chen Fan wrote:
>>>> On 06/10/2015 05:24 AM, Alex Williamson wrote:
>>>>> On Tue, 2015-06-09 at 11:37 +0800, Chen Fan wrote:
>>>>>> vfio_pci_hot_reset differentiates between single and multiple in-use
>>>>>> devices for reset. But sometimes we own the group without owning any
>>>>>> of its devices; that case should also support hot reset.
>>>>> Nope, did you try it?  It can be done, but the group still needs to be
>>>>> connected to a container for isolation.
>>>> I'm sorry about that; I have no such host at hand. But I think if we
>>>> keep a container connected for each affected group, we can also use
>>>> this method to do a host bus reset.
>>> All you need is a dual-port card with isolation, which includes all
>>> Intel 1G NICs (igb & e1000e) as of the quirks that are currently in
>>> linux-next to be pushed for v4.2.  Intel 10G NICs are already quirked
>>> upstream.  There are certainly ways to fake isolation for testing as
>>> well.  Thanks,
>> I just have an Intel Corporation 82576 dual-port card, but how can I fake
>> an isolation group for this card in the linux-next kernel? Can you point
>> me to the documentation?
> If you're running linux-next and have the card installed under a root
> port that provides isolation then each port should be in a separate
> iommu group.  Nearly all Intel PCH root ports should have quirks to
> enable isolation.  If you're installing it in a processor root port
> slot, you need to use a Xeon E5 or better CPU or else the lack of
> isolation at the root port will negate the isolation capabilities of the
> endpoint.  Thanks,
Hi Alex,

I have tested the case with isolated groups on the latest linux-next.
On the host, the two ports of the dual-port card look like this:
# readlink /sys/bus/pci/devices/0000\:06\:00.0/iommu_group
../../../../kernel/iommu_groups/30
# readlink /sys/bus/pci/devices/0000\:06\:00.1/iommu_group
../../../../kernel/iommu_groups/31

I used my v10 QEMU code, which adds the affected groups to the VM, to
test this case with the following QEMU command:
qemu-system-x86_64 -M q35 \
    -device ioh3420,bus=pcie.0,addr=1c.0,port=1,id=bridge1 \
    -device vfio-pci,host=06:00.0,bus=bridge1,aer=true --enable-kvm

Then I injected an AER error on the host with aer-inject. AER recovery
succeeded in the guest, and the 06:00.0 NIC could be used again by
NetworkManager. But when I rebound both devices to the host driver, the
host showed this error:

Jun 18 19:13:36 TX300I kernel: [<ffffffff81666913>] dump_stack+0x45/0x57
Jun 18 19:13:36 TX300I kernel: [<ffffffff8107986a>] warn_slowpath_common+0x8a/0xc0
Jun 18 19:13:36 TX300I kernel: [<ffffffff810798f5>] warn_slowpath_fmt+0x55/0x70
Jun 18 19:13:36 TX300I kernel: [<ffffffff81326818>] bad_io_access+0x38/0x40
Jun 18 19:13:36 TX300I kernel: [<ffffffff81326a57>] pci_iounmap+0x27/0x40
Jun 18 19:13:36 TX300I kernel: [<ffffffffa01254ad>] igb_probe+0xafd/0x1280 [igb]
Jun 18 19:13:36 TX300I kernel: [<ffffffff8134a6d5>] local_pci_probe+0x45/0xa0
Jun 18 19:13:36 TX300I kernel: [<ffffffff8134b884>] ? pci_match_device+0xf4/0x120
Jun 18 19:13:36 TX300I kernel: [<ffffffff8134b9d9>] pci_device_probe+0xe9/0x130
Jun 18 19:13:36 TX300I kernel: [<ffffffff8143547f>] driver_probe_device+0x14f/0x420
Jun 18 19:13:36 TX300I kernel: [<ffffffff8143389c>] bind_store+0xdc/0x120
Jun 18 19:13:36 TX300I kernel: [<ffffffff81432f74>] drv_attr_store+0x24/0x30
Jun 18 19:13:36 TX300I kernel: [<ffffffff8126de9a>] sysfs_kf_write+0x3a/0x50
Jun 18 19:13:36 TX300I kernel: [<ffffffff8126d520>] kernfs_fop_write+0x120/0x170
Jun 18 19:13:36 TX300I kernel: [<ffffffff811f13a7>] __vfs_write+0x37/0x100
Jun 18 19:13:36 TX300I kernel: [<ffffffff811f40d8>] ? __sb_start_write+0x58/0x110
Jun 18 19:13:36 TX300I kernel: [<ffffffff811f1aa9>] vfs_write+0xa9/0x190
Jun 18 19:13:36 TX300I kernel: [<ffffffff81023556>] ? do_audit_syscall_entry+0x66/0x70
Jun 18 19:13:36 TX300I kernel: [<ffffffff811f28a5>] SyS_write+0x55/0xc0
Jun 18 19:13:36 TX300I kernel: [<ffffffff8166d72e>] entry_SYSCALL_64_fastpath+0x12/0x71
Jun 18 19:13:36 TX300I kernel: ---[ end trace b312cb051751fac4 ]---
Jun 18 19:13:36 TX300I kernel: igb: probe of 0000:06:00.1 failed with error -5

The 06:00.0 device can be initialized by the igb driver, but the affected
06:00.1 device cannot.
Shouldn't the reset have reinitialized the state of the 06:00.1 device?

Thanks,
Chen




> Alex
>
>>>>>> Signed-off-by: Chen Fan <address@hidden>
>>>>>> ---
>>>>>>  hw/vfio/pci.c | 11 +++++++++++
>>>>>>  1 file changed, 11 insertions(+)
>>>>>>
>>>>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>>>>> index a4e8658..6507f39 100644
>>>>>> --- a/hw/vfio/pci.c
>>>>>> +++ b/hw/vfio/pci.c
>>>>>> @@ -3398,6 +3398,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
>>>>>>          PCIHostDeviceAddress host;
>>>>>>          VFIOPCIDevice *tmp;
>>>>>>          VFIODevice *vbasedev_iter;
>>>>>> +        bool found;
>>>>>>  
>>>>>>          host.domain = devices[i].segment;
>>>>>>          host.bus = devices[i].bus;
>>>>>> @@ -3427,6 +3428,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
>>>>>>              goto out;
>>>>>>          }
>>>>>>  
>>>>>> +        found = false;
>>>>>>          /* Prep dependent devices for reset and clear our marker. */
>>>>>>          QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
>>>>>>              if (vbasedev_iter->type != VFIO_DEVICE_TYPE_PCI) {
>>>>>> @@ -3438,12 +3440,21 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
>>>>>>                      ret = -EINVAL;
>>>>>>                      goto out_single;
>>>>>>                  }
>>>>>> +                found = true;
>>>>>>                  vfio_pci_pre_reset(tmp);
>>>>>>                  tmp->vbasedev.needs_reset = false;
>>>>>>                  multi = true;
>>>>>>                  break;
>>>>>>              }
>>>>>>          }
>>>>>> +
>>>>>> +        /*
>>>>>> +         * If we own the group but do not own the device, we should
>>>>>> +         * also call hot reset with multi.
>>>>>> +         */
>>>>>> +        if (!single && !found) {
>>>>>> +            multi = true;
>>>>>> +        }
>>>>>>      }
>>>>>>  
>>>>>>      if (!single && !multi) {
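
The decision this patch adds can be looked at in isolation. The following is only a rough, self-contained model of the loop in vfio_pci_hot_reset, not the QEMU code itself: struct affected_dev, enum plan_result and plan_hot_reset() are hypothetical names, and the final "no multi" check is assumed to reject the request because its body is not shown in the quoted diff. It also ignores the point raised earlier in the thread that the group must still be connected to a container for isolation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry in the list of affected devices. */
struct affected_dev {
    bool group_owned;   /* the device's group is held by this VM */
    bool dev_in_use;    /* the device itself is assigned to this VM */
};

/* Hypothetical result codes for this model only. */
enum plan_result {
    PLAN_OK,
    PLAN_MISSING_GROUP,     /* an affected group is not owned: give up */
    PLAN_SINGLE_CONFLICT,   /* single reset requested, another device in use */
    PLAN_NO_MULTI           /* multi reset requested, nothing qualified */
};

/*
 * Model of the per-device loop: decide whether a multi-device hot reset
 * applies. "found" mirrors the flag introduced by the patch.
 */
enum plan_result plan_hot_reset(const struct affected_dev *devs, size_t n,
                                bool single, bool *multi_out)
{
    bool multi = false;
    size_t i;

    for (i = 0; i < n; i++) {
        bool found = false;

        if (!devs[i].group_owned) {
            return PLAN_MISSING_GROUP;
        }

        if (devs[i].dev_in_use) {
            if (single) {
                return PLAN_SINGLE_CONFLICT;
            }
            found = true;
            multi = true;
        }

        /* The patch's addition: group owned, but no device in it owned. */
        if (!single && !found) {
            multi = true;
        }
    }

    /* Assumption: this check fails the request, as in the error paths above. */
    if (!single && !multi) {
        return PLAN_NO_MULTI;
    }

    *multi_out = multi;
    return PLAN_OK;
}
```

In this model, a single entry with group_owned set and dev_in_use clear yields PLAN_OK with multi set when single is false, which is the new case the patch is trying to cover.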



