[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [BUG]Unassigned mem write during pci device hot-plug
From: |
Michael S. Tsirkin |
Subject: |
Re: [Qemu-devel] [BUG]Unassigned mem write during pci device hot-plug |
Date: |
Mon, 10 Dec 2018 23:25:34 -0500 |
On Tue, Dec 11, 2018 at 03:51:09AM +0000, xuyandong wrote:
> > On Tue, Dec 11, 2018 at 02:55:43AM +0000, xuyandong wrote:
> > > On Tue, Dec 11, 2018 at 01:47:37AM +0000, xuyandong wrote:
> > > > > On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote:
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > In our test, we configured VM with several pci-bridges and
> > > > > > > > > a virtio-net nic been attached with bus 4,
> > > > > > > > >
> > > > > > > > > After VM is startup, We ping this nic from host to judge
> > > > > > > > > if it is working normally. Then, we hot add pci devices to
> > > > > > > > > this VM with bus
> > > > 0.
> > > > > > > > >
> > > > > > > > > We found the virtio-net NIC in bus 4 is not working (can
> > > > > > > > > not
> > > > > > > > > connect) occasionally, as it kick virtio backend failure with
> > > > > > > > > error
> > below:
> > > > > > > > >
> > > > > > > > > Unassigned mem write 00000000fc803004 = 0x1
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > memory-region: pci_bridge_pci
> > > > > > > > >
> > > > > > > > > 0000000000000000-ffffffffffffffff (prio 0, RW):
> > > > > > > > > pci_bridge_pci
> > > > > > > > >
> > > > > > > > > 00000000fc800000-00000000fc803fff (prio 1, RW):
> > > > > > > > > virtio-pci
> > > > > > > > >
> > > > > > > > > 00000000fc800000-00000000fc800fff (prio 0, RW):
> > > > > > > > > virtio-pci-common
> > > > > > > > >
> > > > > > > > > 00000000fc801000-00000000fc801fff (prio 0, RW):
> > > > > > > > > virtio-pci-isr
> > > > > > > > >
> > > > > > > > > 00000000fc802000-00000000fc802fff (prio 0, RW):
> > > > > > > > > virtio-pci-device
> > > > > > > > >
> > > > > > > > > 00000000fc803000-00000000fc803fff (prio 0, RW):
> > > > > > > > > virtio-pci-notify <- io mem unassigned
> > > > > > > > >
> > > > > > > > > …
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > We caught an exceptional address changing while this
> > > > > > > > > problem happened, show as
> > > > > > > > > follow:
> > > > > > > > >
> > > > > > > > > Before pci_bridge_update_mappings:
> > > > > > > > >
> > > > > > > > > 00000000fc000000-00000000fc1fffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
> > > > > > > > > 00000000fc000000-00000000fc1fffff
> > > > > > > > >
> > > > > > > > > 00000000fc200000-00000000fc3fffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
> > > > > > > > > 00000000fc200000-00000000fc3fffff
> > > > > > > > >
> > > > > > > > > 00000000fc400000-00000000fc5fffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
> > > > > > > > > 00000000fc400000-00000000fc5fffff
> > > > > > > > >
> > > > > > > > > 00000000fc600000-00000000fc7fffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
> > > > > > > > > 00000000fc600000-00000000fc7fffff
> > > > > > > > >
> > > > > > > > > 00000000fc800000-00000000fc9fffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
> > > > > > > > > 00000000fc800000-00000000fc9fffff
> > > > > > > > > <- correct Adress Spce
> > > > > > > > >
> > > > > > > > > 00000000fca00000-00000000fcbfffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
> > > > > > > > > 00000000fca00000-00000000fcbfffff
> > > > > > > > >
> > > > > > > > > 00000000fcc00000-00000000fcdfffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
> > > > > > > > > 00000000fcc00000-00000000fcdfffff
> > > > > > > > >
> > > > > > > > > 00000000fce00000-00000000fcffffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
> > > > > > > > > 00000000fce00000-00000000fcffffff
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > After pci_bridge_update_mappings:
> > > > > > > > >
> > > > > > > > > 00000000fda00000-00000000fdbfffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_mem @pci_bridge_pci
> > > > > > > > > 00000000fda00000-00000000fdbfffff
> > > > > > > > >
> > > > > > > > > 00000000fdc00000-00000000fddfffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_mem @pci_bridge_pci
> > > > > > > > > 00000000fdc00000-00000000fddfffff
> > > > > > > > >
> > > > > > > > > 00000000fde00000-00000000fdffffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_mem @pci_bridge_pci
> > > > > > > > > 00000000fde00000-00000000fdffffff
> > > > > > > > >
> > > > > > > > > 00000000fe000000-00000000fe1fffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_mem @pci_bridge_pci
> > > > > > > > > 00000000fe000000-00000000fe1fffff
> > > > > > > > >
> > > > > > > > > 00000000fe200000-00000000fe3fffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_mem @pci_bridge_pci
> > > > > > > > > 00000000fe200000-00000000fe3fffff
> > > > > > > > >
> > > > > > > > > 00000000fe400000-00000000fe5fffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_mem @pci_bridge_pci
> > > > > > > > > 00000000fe400000-00000000fe5fffff
> > > > > > > > >
> > > > > > > > > 00000000fe600000-00000000fe7fffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_mem @pci_bridge_pci
> > > > > > > > > 00000000fe600000-00000000fe7fffff
> > > > > > > > >
> > > > > > > > > 00000000fe800000-00000000fe9fffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_mem @pci_bridge_pci
> > > > > > > > > 00000000fe800000-00000000fe9fffff
> > > > > > > > >
> > > > > > > > > fffffffffc800000-fffffffffc800000 (prio 1, RW):
> > > > > > > > > alias
> > > > > > pci_bridge_pref_mem
> > > > > > > > > @pci_bridge_pci fffffffffc800000-fffffffffc800000 <-
> > > > > > > > > Exceptional
> > > > Adress
> > > > > > > > Space
> > > > > > > >
> > > > > > > > This one is empty though right?
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > We have figured out why this address becomes this value,
> > > > > > > > > according to pci spec, pci driver can get BAR address
> > > > > > > > > size by writing 0xffffffff to
> > > > > > > > >
> > > > > > > > > the pci register firstly, and then read back the value from
> > > > > > > > > this
> > register.
> > > > > > > >
> > > > > > > >
> > > > > > > > OK however as you show below the BAR being sized is the BAR
> > > > > > > > if a bridge. Are you then adding a bridge device by hotplug?
> > > > > > >
> > > > > > > No, I just simply hot plugged a VFIO device to Bus 0, another
> > > > > > > interesting phenomenon is If I hot plug the device to other
> > > > > > > bus, this doesn't
> > > > > > happened.
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > We didn't handle this value specially while process pci
> > > > > > > > > write in qemu, the function call stack is:
> > > > > > > > >
> > > > > > > > > Pci_bridge_dev_write_config
> > > > > > > > >
> > > > > > > > > -> pci_bridge_write_config
> > > > > > > > >
> > > > > > > > > -> pci_default_write_config (we update the config[address]
> > > > > > > > > -> value here to
> > > > > > > > > fffffffffc800000, which should be 0xfc800000 )
> > > > > > > > >
> > > > > > > > > -> pci_bridge_update_mappings
> > > > > > > > >
> > > > > > > > > ->pci_bridge_region_del(br, br->windows);
> > > > > > > > >
> > > > > > > > > -> pci_bridge_region_init
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > -> pci_bridge_init_alias (here pci_bridge_get_base, we use
> > > > > > > > > -> the
> > > > > > > > > wrong value
> > > > > > > > > fffffffffc800000)
> > > > > > > > >
> > > > > > > > > ->
> > > > > > > > > memory_region_transaction_commit
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > So, as we can see, we use the wrong base address in qemu
> > > > > > > > > to update the memory regions, though, we update the base
> > > > > > > > > address to
> > > > > > > > >
> > > > > > > > > The correct value after pci driver in VM write the
> > > > > > > > > original value back, the virtio NIC in bus 4 may still
> > > > > > > > > sends net packets concurrently with
> > > > > > > > >
> > > > > > > > > The wrong memory region address.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > We have tried to skip the memory region update action in
> > > > > > > > > qemu while detect pci write with 0xffffffff value, and it
> > > > > > > > > does work, but
> > > > > > > > >
> > > > > > > > > This seems to be not gently.
> > > > > > > >
> > > > > > > > For sure. But I'm still puzzled as to why does Linux try to
> > > > > > > > size the BAR of the bridge while a device behind it is used.
> > > > > > > >
> > > > > > > > Can you pls post your QEMU command line?
> > > > > > >
> > > > > > > My QEMU command line:
> > > > > > > /root/xyd/qemu-system-x86_64 -name
> > > > > > > guest=Linux,debug-threads=on -S -object
> > > > > > > secret,id=masterKey0,format=raw,file=/var/run/libvirt/qemu/dom
> > > > > > > ain-
> > > > > > > 194-
> > > > > > > Linux/master-key.aes -machine
> > > > > > > pc-i440fx-2.8,accel=kvm,usb=off,dump-guest-core=off -cpu
> > > > > > > host,+kvm_pv_eoi -bios /usr/share/OVMF/OVMF.fd -m
> > > > > > > size=4194304k,slots=256,maxmem=33554432k -realtime mlock=off
> > > > > > > -smp
> > > > > > > 20,sockets=20,cores=1,threads=1 -numa
> > > > > > > node,nodeid=0,cpus=0-4,mem=1024 -numa
> > > > > > > node,nodeid=1,cpus=5-9,mem=1024 -numa
> > > > > > > node,nodeid=2,cpus=10-14,mem=1024 -numa
> > > > > > > node,nodeid=3,cpus=15-19,mem=1024 -uuid
> > > > > > > 34a588c7-b0f2-4952-b39c-47fae3411439 -no-user-config
> > > > > > > -nodefaults -chardev
> > > > > > > socket,id=charmonitor,path=/var/run/libvirt/qemu/domain-194-Li
> > > > > > > nux/
> > > > > > > moni
> > > > > > > tor.sock,server,nowait -mon
> > > > > > > chardev=charmonitor,id=monitor,mode=control -rtc base=utc
> > > > > > > -no-hpet -global kvm-pit.lost_tick_policy=delay -no-shutdown
> > > > > > > -boot strict=on -device
> > > > > > > pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0x8 -device
> > > > > > > pci-bridge,chassis_nr=2,id=pci.2,bus=pci.0,addr=0x9 -device
> > > > > > > pci-bridge,chassis_nr=3,id=pci.3,bus=pci.0,addr=0xa -device
> > > > > > > pci-bridge,chassis_nr=4,id=pci.4,bus=pci.0,addr=0xb -device
> > > > > > > pci-bridge,chassis_nr=5,id=pci.5,bus=pci.0,addr=0xc -device
> > > > > > > piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
> > > > > > > usb-ehci,id=usb1,bus=pci.0,addr=0x10 -device
> > > > > > > nec-usb-xhci,id=usb2,bus=pci.0,addr=0x11 -device
> > > > > > > virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3 -device
> > > > > > > virtio-scsi-pci,id=scsi1,bus=pci.0,addr=0x4 -device
> > > > > > > virtio-scsi-pci,id=scsi2,bus=pci.0,addr=0x5 -device
> > > > > > > virtio-scsi-pci,id=scsi3,bus=pci.0,addr=0x6 -device
> > > > > > > virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 -drive
> > > > > > > file=/mnt/sdb/xml/centos_74_x64_uefi.raw,format=raw,if=none,id
> > > > > > > =dri
> > > > > > > ve-v
> > > > > > > irtio-disk0,cache=none -device
> > > > > > > virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-
> > > > > > > disk
> > > > > > > 0,id
> > > > > > > =virtio-disk0,bootindex=1 -drive
> > > > > > > if=none,id=drive-ide0-1-1,readonly=on,cache=none -device
> > > > > > > ide-cd,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1
> > > > > > > -netdev
> > > > > > > tap,fd=35,id=hostnet0 -device
> > > > > > > virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:89:5d:8b,b
> > > > > > > us=p
> > > > > > > ci.4
> > > > > > > ,addr=0x1 -chardev pty,id=charserial0 -device
> > > > > > > isa-serial,chardev=charserial0,id=serial0 -device
> > > > > > > usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -device
> > > > > > > cirrus-vga,id=video0,vgamem_mb=8,bus=pci.0,addr=0x12 -device
> > > > > > > virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xd -msg
> > > > > > > timestamp=on
> > > > > > >
> > > > > > > I am also very curious about this issue, in the linux kernel
> > > > > > > code, maybe double
> > > > > > check in function pci_bridge_check_ranges triggered this problem.
> > > > > >
> > > > > > If you can get the stacktrace in Linux when it tries to write
> > > > > > this fffff value, that would be quite helpful.
> > > > > >
> > > > >
> > > > > After I add mdelay(100) in function pci_bridge_check_ranges, this
> > > > > phenomenon is easier to reproduce, below is my modify in kernel:
> > > > > diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> > > > > index cb389277..86e232d 100644
> > > > > --- a/drivers/pci/setup-bus.c
> > > > > +++ b/drivers/pci/setup-bus.c
> > > > > @@ -27,7 +27,7 @@
> > > > > #include <linux/slab.h>
> > > > > #include <linux/acpi.h>
> > > > > #include "pci.h"
> > > > > -
> > > > > +#include <linux/delay.h>
> > > > > unsigned int pci_flags;
> > > > >
> > > > > struct pci_dev_resource {
> > > > > @@ -787,6 +787,9 @@ static void pci_bridge_check_ranges(struct
> > > > > pci_bus
> > > > *bus)
> > > > > pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32,
> > > > > 0xffffffff);
> > > > > pci_read_config_dword(bridge,
> > > > > PCI_PREF_BASE_UPPER32, &tmp);
> > > > > + mdelay(100);
> > > > > + printk(KERN_ERR "sleep\n");
> > > > > + dump_stack();
> > > > > if (!tmp)
> > > > > b_res[2].flags &= ~IORESOURCE_MEM_64;
> > > > > pci_write_config_dword(bridge,
> > > > > PCI_PREF_BASE_UPPER32,
> > > > >
> > > >
> > > > OK!
> > > > I just sent a Linux patch that should help.
> > > > I would appreciate it if you will give it a try and if that helps
> > > > reply to it with a
> > > > Tested-by: tag.
> > > >
> > >
> > > I tested this patch and it works fine on my machine.
> > >
> > > But I have another question, if we only fix this problem in the
> > > kernel, the Linux version that has been released does not work well on the
> > virtualization platform.
> > > Is there a way to fix this problem in the backend?
> >
> > There could we a way to work around this.
> > Does below help?
>
> I am sorry to tell you, I tested this patch and it doesn't work fine.
>
> >
> > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index
> > 236a20eaa8..7834cac4b0 100644
> > --- a/hw/i386/acpi-build.c
> > +++ b/hw/i386/acpi-build.c
> > @@ -551,7 +551,7 @@ static void build_append_pci_bus_devices(Aml
> > *parent_scope, PCIBus *bus,
> >
> > aml_append(method, aml_store(aml_int(bsel_val), aml_name("BNUM")));
> > aml_append(method,
> > - aml_call2("DVNT", aml_name("PCIU"), aml_int(1) /* Device Check
> > */)
> > + aml_call2("DVNT", aml_name("PCIU"), aml_int(4) /* Device
> > + Check Light */)
> > );
> > aml_append(method,
> > aml_call2("DVNT", aml_name("PCID"), aml_int(3)/* Eject Request
> > */)
Oh I see, another bug:
case ACPI_NOTIFY_DEVICE_CHECK_LIGHT:
acpi_handle_debug(handle, "ACPI_NOTIFY_DEVICE_CHECK_LIGHT
event\n");
/* TBD: Exactly what does 'light' mean? */
break;
And then e.g. acpi_generic_hotplug_event(struct acpi_device *adev, u32 type)
and friends all just ignore this event type.
--
MST
- Re: [Qemu-devel] [BUG]Unassigned mem write during pci device hot-plug, (continued)
- Re: [Qemu-devel] [BUG]Unassigned mem write during pci device hot-plug, Michael S. Tsirkin, 2018/12/09
- Re: [Qemu-devel] [BUG]Unassigned mem write during pci device hot-plug, xuyandong, 2018/12/09
- Re: [Qemu-devel] [BUG]Unassigned mem write during pci device hot-plug, Michael S. Tsirkin, 2018/12/10
- Re: [Qemu-devel] [BUG]Unassigned mem write during pci device hot-plug, xuyandong, 2018/12/10
- Re: [Qemu-devel] [BUG]Unassigned mem write during pci device hot-plug, Michael S. Tsirkin, 2018/12/10
- Re: [Qemu-devel] [BUG]Unassigned mem write during pci device hot-plug, xuyandong, 2018/12/10
- Re: [Qemu-devel] [BUG]Unassigned mem write during pci device hot-plug, Michael S. Tsirkin, 2018/12/10
- Re: [Qemu-devel] [BUG]Unassigned mem write during pci device hot-plug, xuyandong, 2018/12/10
- Re: [Qemu-devel] [BUG]Unassigned mem write during pci device hot-plug, Michael S. Tsirkin, 2018/12/10
- Re: [Qemu-devel] [BUG]Unassigned mem write during pci device hot-plug,
Michael S. Tsirkin <=