Re: [Qemu-devel] how Windows treats BARs of driver-less devices when oth
From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] how Windows treats BARs of driver-less devices when other devices are hotplugged
Date: Thu, 25 Feb 2016 15:30:57 +0200

On Thu, Feb 25, 2016 at 02:00:09PM +0100, Laszlo Ersek wrote:
> On 02/25/16 13:44, Laszlo Ersek wrote:
> > Hi,
> >
> > On 02/25/16 12:57, Michael S. Tsirkin wrote:
> >> ----- Forwarded message from Igor Mammedov <address@hidden> -----
> >>
> >> Date: Thu, 11 Feb 2016 16:16:05 +0100
> >> From: Igor Mammedov <address@hidden>
> >> To: "Michael S. Tsirkin" <address@hidden>
> >> To: address@hidden
> >> Subject: on pci rebalancing
> >> Message-ID: <address@hidden>
> >> In-Reply-To: <address@hidden>
> >>
> >>>>>> For PCI rebalance to work on Windows, one has to provide a working PCI
> >>>>>> driver; otherwise the OS will ignore the device when rebalancing
> >>>>>> happens and might map something else over the ignored BAR.
> >>>>>
> >>>>> Does it disable the BAR then? Or just move it elsewhere?
> >>>> it doesn't, it just blindly ignores the BAR's existence and maps the BAR
> >>>> of another device, one that has a driver, over it.
> >>>
> >>> Interesting. On classical PCI this is a forbidden configuration.
> >>> Maybe we do something that confuses windows?
> >>> Could you tell me how to reproduce this behaviour?
> >> #cat > t << EOF
> >> pci_update_mappings_del
> >> pci_update_mappings_add
> >> EOF
> >>
> >> #./x86_64-softmmu/qemu-system-x86_64 -snapshot -enable-kvm \
> >> -monitor unix:/tmp/m,server,nowait -device pci-bridge,chassis_nr=1 \
> >> -boot menu=on -m 4G -trace events=t ws2012r2x64dc.img \
> >> -device ivshmem,id=foo,size=2M,shm,bus=pci.1,addr=01
> >>
> >> wait till OS boots, note BARs programmed for ivshmem
> >> in my case it was
> >> 01:01.0 0,0xfe800000+0x100
> >> then execute script and watch pci_update_mappings* trace events
> >>
> >> # for i in $(seq 3 18); do printf -- "device_add e1000,bus=pci.1,addr=%x\n" $i | nc -U /tmp/m; sleep 5; done;
> >>
> >> hotplugging e1000,bus=pci.1,addr=12 triggers rebalancing where
> >> Windows unmaps all BARs of nics on bridge but doesn't touch ivshmem
> >> and then programs new BARs, where:
> >> pci_update_mappings_add d=0x7fa02ff0cf90 01:11.0 0,0xfe800000+0x20000
> >> creates overlapping BAR with ivshmem
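
The overlap reported above can be checked mechanically from the trace output. The following is a minimal sketch in Python, assuming the trace lines end in the "<bus>:<slot>.<fn> <bar>,0x<base>+0x<size>" form shown in this message; the helper names parse_mapping and overlaps are made up for illustration and are not part of QEMU.

```python
import re

# Matches the "<bus>:<slot>.<fn> <bar>,0x<base>+0x<size>" tail of a
# pci_update_mappings_* trace line, as quoted in the message above.
MAPPING_RE = re.compile(
    r"(?P<bdf>[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]) "
    r"(?P<bar>\d+),0x(?P<base>[0-9a-f]+)\+0x(?P<size>[0-9a-f]+)"
)


def parse_mapping(text):
    """Return (bdf, bar_index, base, size) parsed from a trace line."""
    match = MAPPING_RE.search(text)
    if match is None:
        raise ValueError("no BAR mapping found in: %r" % text)
    return (
        match["bdf"],
        int(match["bar"]),
        int(match["base"], 16),
        int(match["size"], 16),
    )


def overlaps(a, b):
    """True if the half-open ranges [base, base + size) of a and b intersect."""
    _, _, a_base, a_size = a
    _, _, b_base, b_size = b
    return a_base < b_base + b_size and b_base < a_base + a_size


# The two mappings quoted in the message: ivshmem BAR 0, then the e1000 BAR 0
# that Windows programs after the rebalance.
ivshmem = parse_mapping("01:01.0 0,0xfe800000+0x100")
e1000 = parse_mapping(
    "pci_update_mappings_add d=0x7fa02ff0cf90 01:11.0 0,0xfe800000+0x20000"
)

print("overlap:", overlaps(ivshmem, e1000))
```

Both BARs start at 0xfe800000, so the 0x100-byte ivshmem window lies entirely inside the 0x20000-byte e1000 window.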
> >
> > Michael informed me of this on IRC (and forwarded this email to me). I hope
> > to start a new thread with my response. (I also reedited the subject fully.)
> >
> > So, to summarize what I said on IRC first. The situation where firmware
> > recognizes and enables a PCI device, hands control to the OS, and then the
> > OS lacks a driver for the PCI device, is completely normal and expected.
> > For UEFI specifically, I can name a general argument and a specific
> > argument.
> >
> > The general argument is that actions that need to be taken in
> > ExitBootServices() callbacks do not include clearing IO or MMIO decode bits
> > in PCI device command registers. Command register manipulation happens when
> > a PCI device driver (that conforms to the UEFI driver model) *binds* or
> > *unbinds* a device. And unbinding a device is not possible in the
> > ExitBootServices() callback, minimally because such callbacks are forbidden
> > from modifying the memory map -- but unbinding would release allocated
> > memory.
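
For reference, the decode bits discussed above live in the 16-bit command register at offset 0x04 of the PCI configuration header: bit 0 enables I/O space decoding, bit 1 enables memory space decoding, and bit 2 enables bus mastering. The following Python sketch decodes them from a raw copy of the header; the function name decode_command and the sample header contents are assumptions for illustration only.

```python
import struct

PCI_COMMAND_OFFSET = 0x04    # 16-bit command register in the config header
PCI_COMMAND_IO = 0x0001      # bit 0: respond to I/O space accesses
PCI_COMMAND_MEMORY = 0x0002  # bit 1: respond to memory space (MMIO) accesses
PCI_COMMAND_MASTER = 0x0004  # bit 2: device may act as a bus master (DMA)


def decode_command(config_space):
    """Report the decode and bus-master bits from raw config space bytes."""
    (command,) = struct.unpack_from("<H", config_space, PCI_COMMAND_OFFSET)
    return {
        "io_decode": bool(command & PCI_COMMAND_IO),
        "mmio_decode": bool(command & PCI_COMMAND_MEMORY),
        "bus_master": bool(command & PCI_COMMAND_MASTER),
    }


# Hypothetical 64-byte header of a device that a firmware driver has bound:
# MMIO decoding and bus mastering enabled, I/O decoding left off.
header = bytearray(64)
struct.pack_into(
    "<H", header, PCI_COMMAND_OFFSET, PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER
)

print(decode_command(header))
```

On a Linux host or guest, the same bytes can be read from /sys/bus/pci/devices/<address>/config and passed to the function unchanged.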
> >
> > So what we use such callbacks for is aborting in-flight, outstanding
> > DMA-like transfers. Re-setting virtio devices is also an example (think
> > outstanding receive requests for virtio-net).
> >
> > Now let's move on to the specific argument I mentioned above. The Graphics
> > Output Protocol (GOP) is a UEFI abstraction that was specifically designed
> > for the case where the operating system doesn't have a display driver
> > installed yet, but the user obviously has to use the display
> > somehow. The GOP is most frequently provided on top of an
> > EFI_PCI_IO_PROTOCOL instance; meaning simply that the "GOP driver" is a
> > UEFI driver that drives a PCI device. In short, the driver provides the GOP
> > on top of a PCI device.
> >
> > Now, the GOP is supposed to communicate the pixel format and the frame
> > buffer base address for the currently active graphics mode to the software
> > that consumes the GOP. This includes UEFI applications of course (think a
> > boot loader putting up a splash screen or an animation), but importantly,
> > the runtime OS is *also* supposed to inherit these characteristics from
> > boot services time. The OS can then use simple unaccelerated MMIO writes to
> > display things on the screen, until the user installs an accelerated
> > driver.
> >
> > (Concrete example: this is why you can see *anything at all* on the screen,
> > when you run e.g. Windows Server 2012 R2 on top of OVMF and a QXL display,
> > before installing the QXL WDDM driver in the guest.)
> >
> > Clearly, the frame buffer base address communicated through the GOP points
> > into one of the MMIO BARs of the PCI device. If, at ExitBootServices(),
> > MMIO decoding were disabled for the PCI device that underlies the GOP, that
> > would *completely* defeat the GOP design. The OS's attempt to poke at those
> > MMIO addresses would be futile -- and in fact the OS has no idea what PCI
> > device (if any) the framebuffer is supposed to be related to. This is the
> > jurisdiction of the OS-level display driver -- if one exists and is
> > installed.
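
To make the relationship between the GOP frame buffer and the device BARs concrete, here is a small Python sketch. All addresses and sizes are hypothetical, and bar_containing is an illustrative helper rather than any firmware or OS interface. It looks up which BAR window, if any, fully contains the frame buffer range that the OS inherits.

```python
def bar_containing(bars, fb_base, fb_size):
    """Return the index of the BAR whose window fully contains the
    frame buffer range [fb_base, fb_base + fb_size), or None."""
    for index, (base, size) in sorted(bars.items()):
        if base <= fb_base and fb_base + fb_size <= base + size:
            return index
    return None


# Hypothetical MMIO BARs of a display device: index -> (base, size).
bars = {
    0: (0xC0000000, 0x04000000),
    2: (0xC4000000, 0x00002000),
}

# Hypothetical frame buffer as reported through the GOP: 1024x768, 32 bpp.
fb_base = 0xC0000000
fb_size = 1024 * 768 * 4

print("frame buffer lives in BAR", bar_containing(bars, fb_base, fb_size))
```

The OS only receives fb_base and fb_size; without a display driver it has no equivalent of the bars table, which is why the firmware must leave MMIO decoding enabled for those writes to reach the device.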
> >
> > So, this is a Windows bug in my opinion. Just because there is no OS-level
> > driver, a PCI device is fully expected to be decoding resources, if the
> > firmware brought it up.
> >
> > --*--
> >
> > Okay, so Michael asked me to try to reproduce the above with OVMF, and see
> > what happens. Unfortunately I'm not really knowledgeable about ivshmem,
> > hotplug, et cetera. Let me instead tell Igor about using OVMF.
> >
> > (1) Please follow the instructions on Gerd's page
> > <https://www.kraxel.org/repos/>, and install the "edk2.git-ovmf-x64"
> > package.
> >
> > (2) Create a separate directory for testing. In this directory, run the
> > following command:
> >
> > cp /usr/share/edk2.git/ovmf-x64/OVMF_VARS-pure-efi.fd myvars.fd
> >
> > Also create a disk image for your new guest, etc.
> >
> > (3) Use the following command line snippet to work with OVMF:
> >
> > qemu-system-x86_64 \
> > -machine accel=kvm \
> > -smp cpus=2 \
> > -m 2048 \
> > \
> > -debugcon file:ovmf.debug.log \
> > -global isa-debugcon.iobase=0x402 \
> > \
> > -device qxl-vga \
> > \
> > -drive if=pflash,format=raw,unit=0,readonly,file=/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd \
> > -drive if=pflash,format=raw,unit=1,file=myvars.fd \
> > \
> > [your options here]
> >
> > You can of course customize the # of VCPUs, memory size, disks, CD-ROMs,
> > network, and so on.
> >
> > Recommended: when you use the -device option to add the disk and the
> > CD-ROM(s) to install the OS (and driver(s)) from, be sure to use the
> > "bootindex" property. OVMF will adhere to the boot order. It is recommended
> > to set bootindex=0 for your main disk, bootindex=1 for your OS installer
> > CD-ROM, and *no* bootindex for your virtio-win driver disk. This way at
> > first boot (with no OS installed) OVMF will boot the installer CD-ROM.
> > Further boots (with the same command line) will boot the installed OS.
> >
> > Caveat: I never used the -snapshot option with OVMF virtual machines; it
> > might or might not work.
> >
> > Caveat #2: I had tested simple PCI hotplug and hot-unplug with Windows
> > running on OVMF many months ago, but I can't tell off-hand if it will work
> > right now.
>
> I should also mention that you might not be able to reproduce the same
> situation with the "ivshmem" device. Namely, if there is no UEFI driver
> for that PCI device (and OVMF certainly doesn't have one), then its MMIO
> and IO decoding bits will *never* be set. As I said, command register
> massaging is the jurisdiction of the individual UEFI driver that
> ultimately binds the device -- and OVMF has no UEFI driver for ivshmem.
>
> Therefore you should probably try to reproduce the issue with another
> PCI device type that OVMF has a driver for, but Windows has none
> (installed at least). I'm quite hard pressed to name such a device type,
> unfortunately. :(
virtio?
> Perhaps one of the more obscure emulated NICs could work in place of
> ivshmem. (The IPXE oproms provide UEFI drivers for those.)
>
> Thanks
> Laszlo