Re: [Qemu-devel] how Windows treats BARs of driver-less devices when oth


From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] how Windows treats BARs of driver-less devices when other devices are hotplugged
Date: Thu, 25 Feb 2016 15:30:57 +0200

On Thu, Feb 25, 2016 at 02:00:09PM +0100, Laszlo Ersek wrote:
> On 02/25/16 13:44, Laszlo Ersek wrote:
> > Hi,
> > 
> > On 02/25/16 12:57, Michael S. Tsirkin wrote:
> >> ----- Forwarded message from Igor Mammedov <address@hidden> -----
> >>
> >> Date: Thu, 11 Feb 2016 16:16:05 +0100
> >> From: Igor Mammedov <address@hidden>
> >> To: "Michael S. Tsirkin" <address@hidden>
> >> To: address@hidden
> >> Subject: on pci rebalancing
> >> Message-ID: <address@hidden>
> >> In-Reply-To: <address@hidden>
> >>
> >>>>>> For PCI rebalance to work on Windows, one has to provide a working PCI
> >>>>>> driver; otherwise the OS will ignore the device when rebalancing happens
> >>>>>> and might map something else over the ignored BAR.
> >>>>>
> >>>>> Does it disable the BAR then? Or just move it elsewhere?  
> >>>> it doesn't, it just blindly ignores the BAR's existence and maps a BAR of
> >>>> another device that has a driver over it.
> >>>
> >>> Interesting. On classical PCI this is a forbidden configuration.
> >>> Maybe we do something that confuses windows?
> >>> Could you tell me how to reproduce this behaviour?
> >> #cat > t << EOF
> >> pci_update_mappings_del
> >> pci_update_mappings_add
> >> EOF
> >>
> >> #./x86_64-softmmu/qemu-system-x86_64 -snapshot -enable-kvm \
> >>  -monitor unix:/tmp/m,server,nowait -device pci-bridge,chassis_nr=1 \
> >>  -boot menu=on -m 4G -trace events=t ws2012r2x64dc.img \
> >>  -device ivshmem,id=foo,size=2M,shm,bus=pci.1,addr=01
> >>
> >> wait till OS boots, note BARs programmed for ivshmem
> >>  in my case it was
> >>    01:01.0 0,0xfe800000+0x100
> >> then execute script and watch pci_update_mappings* trace events
> >>
> >> # for i in $(seq 3 18); do printf -- "device_add e1000,bus=pci.1,addr=%x\n" $i | nc -U /tmp/m; sleep 5; done;
> >>
> >> hotplugging e1000,bus=pci.1,addr=12 triggers rebalancing where
> >> Windows unmaps all BARs of nics on bridge but doesn't touch ivshmem
> >> and then programs new BARs, where:
> >>   pci_update_mappings_add d=0x7fa02ff0cf90 01:11.0 0,0xfe800000+0x20000
> >> creates overlapping BAR with ivshmem 
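
The overlap reported above can be checked mechanically. What follows is a minimal sketch in Python, not part of the original report: it replays pci_update_mappings_add/del trace lines and flags any newly programmed BAR that intersects one that is still mapped. The line shape is assumed from the single trace line quoted above, and the names LINE_RE, overlaps and find_overlaps are made up for this illustration.

```python
import re

# Assumed line shape, copied from the trace line quoted in the report:
#   pci_update_mappings_add d=0x... 01:11.0 0,0xfe800000+0x20000
LINE_RE = re.compile(
    r"pci_update_mappings_(add|del)\s+d=\S+\s+"
    r"(\S+)\s+(\d+),((?:0x)?[0-9a-fA-F]+)\+((?:0x)?[0-9a-fA-F]+)"
)


def overlaps(base_a, size_a, base_b, size_b):
    """True if [base_a, base_a+size_a) and [base_b, base_b+size_b) intersect."""
    return base_a < base_b + size_b and base_b < base_a + size_a


def find_overlaps(lines, initial=None):
    """Replay add/del events; return ((dev, bar), (dev, bar)) conflict pairs."""
    live = dict(initial or {})  # (device, BAR index) -> (base, size)
    conflicts = []
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # not a mapping event, skip it
        kind, dev, bar, base, size = m.groups()
        key = (dev, int(bar))
        if kind == "del":
            live.pop(key, None)
            continue
        base, size = int(base, 16), int(size, 16)
        for other, (obase, osize) in live.items():
            if other != key and overlaps(base, size, obase, osize):
                conflicts.append((key, other))
        live[key] = (base, size)
    return conflicts


# ivshmem BAR 0 as the guest programmed it at boot (value from the report).
boot_state = {("01:01.0", 0): (0xFE800000, 0x100)}
trace = ["pci_update_mappings_add d=0x7fa02ff0cf90 01:11.0 0,0xfe800000+0x20000"]
print(find_overlaps(trace, boot_state))
```

With the two values from the report, the e1000 BAR at 01:11.0 is flagged as overlapping the ivshmem BAR at 01:01.0. Feeding the sketch the full trace output should show whether the rebalance ever emits a "del" for the driver-less device.
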
> > 
> > Michael informed me of this on IRC (and forwarded this email to me). I hope 
> > to start a new thread with my response. (I also reedited the subject fully.)
> > 
> > So, to summarize what I said on IRC first. The situation where firmware 
> > recognizes and enables a PCI device, hands control to the OS, and then the 
> > OS lacks a driver for the PCI device, is completely normal and expected. 
> > For UEFI specifically, I can name a general argument and a specific 
> > argument.
> > 
> > The general argument is that actions that need to be taken in 
> > ExitBootServices() callbacks do not include clearing IO or MMIO decode bits 
> > in PCI device command registers. Command register manipulation happens when 
> > a PCI device driver (that conforms to the UEFI driver model) *binds* or 
> > *unbinds* a device. And unbinding a device is not possible in the 
> > ExitBootServices() callback, minimally because such callbacks are forbidden 
> > from modifying the memory map -- but unbinding would release allocated 
> > memory.
> > 
> > So what we use such callbacks for is aborting in-flight, outstanding 
> > DMA-like transfers. Re-setting virtio devices is also an example (think 
> > outstanding receive requests for virtio-net).
> > 
> > Now let's move on to the specific argument I mentioned above. The Graphics 
> > Output Protocol (GOP) is a UEFI abstraction that was specifically designed 
> > with the case in mind when the operating system doesn't yet have a display 
> > driver installed, but the user obviously has to use the display 
> > somehow. The GOP is most frequently provided on top of an 
> > EFI_PCI_IO_PROTOCOL instance; meaning simply that the "GOP driver" is a 
> > UEFI driver that drives a PCI device. In short, the driver provides the GOP 
> > on top of a PCI device.
> > 
> > Now, the GOP is supposed to communicate the pixel format and the frame 
> > buffer base address for the currently active graphics mode to the software 
> > that consumes the GOP. This includes UEFI applications of course (think a 
> > boot loader putting up a splash screen or an animation), but importantly, 
> > the runtime OS is *also* supposed to inherit these characteristics from 
> > boot services time. The OS can then use simple unaccelerated MMIO writes to 
> > display things on the screen, until the user installs an accelerated 
> > driver.
> > 
> > (Concrete example: this is why you can see *anything at all* on the screen, 
> > when you run e.g. Windows Server 2012 R2 on top of OVMF and a QXL display, 
> > before installing the QXL WDDM driver in the guest.)
> > 
> > Clearly, the frame buffer base address communicated through the GOP points 
> > into one of the MMIO BARs of the PCI device. If, at ExitBootServices(), 
> > MMIO decoding were disabled for the PCI device that underlies the GOP, that 
> > would *completely* defeat the GOP design. The OS's attempt to poke at those 
> > MMIO addresses would be futile -- and in fact the OS has no idea what PCI 
> > device (if any) the framebuffer is supposed to be related to. This is the 
> > jurisdiction of the OS-level display driver -- if one exists and is 
> > installed.
> > 
> > So, this is a Windows bug in my opinion. Just because there is no OS-level 
> > driver, a PCI device is fully expected to be decoding resources, if the 
> > firmware brought it up.
> > 
> > --*--
> > 
> > Okay, so Michael asked me to try to reproduce the above with OVMF, and see 
> > what happens. Unfortunately I'm not really knowledgeable about ivshmem, 
> > hotplug, et cetera. Let me instead tell Igor about using OVMF.
> > 
> > (1) Please follow the instructions on Gerd's page 
> > <https://www.kraxel.org/repos/>, and install the "edk2.git-ovmf-x64" 
> > package.
> > 
> > (2) Create a separate directory for testing. In this directory, run the 
> > following command:
> > 
> >   cp /usr/share/edk2.git/ovmf-x64/OVMF_VARS-pure-efi.fd myvars.fd
> > 
> > Also create a disk image for your new guest, etc.
> > 
> > (3) Use the following command line snippet to work with OVMF:
> > 
> >      qemu-system-x86_64 \
> >        -machine accel=kvm \
> >        -smp cpus=2 \
> >        -m 2048 \
> >        \
> >        -debugcon file:ovmf.debug.log \
> >        -global isa-debugcon.iobase=0x402 \
> >        \
> >        -device qxl-vga \
> >        \
> >        -drive if=pflash,format=raw,unit=0,readonly,file=/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd \
> >        -drive if=pflash,format=raw,unit=1,file=myvars.fd \
> >        \
> >        [your options here]
> > 
> > You can of course customize the # of VCPUs, memory size, disks, CD-ROMs, 
> > network, and so on.
> > 
> > Recommended: when you use the -device option to add the disk and the 
> > CD-ROM(s) to install the OS (and driver(s)) from, be sure to use the 
> > "bootindex" property. OVMF will adhere to the boot order. It is recommended 
> > to set bootindex=0 for your main disk, bootindex=1 for your OS installer 
> > CD-ROM, and *no* bootindex for your virtio-win driver disk. This way at 
> > first boot (with no OS installed) OVMF will boot the installer CD-ROM. 
> > Further boots (with the same command line) will boot the installed OS.
> > 
> > Caveat: I never used the -snapshot option with OVMF virtual machines; it 
> > might or might not work.
> > 
> > Caveat #2: I had tested simple PCI hotplug and hot-unplug with Windows 
> > running on OVMF many months ago, but I can't tell off-hand if it will work 
> > right now.
> 
> I should also mention that you might not be able to reproduce the same
> situation with the "ivshmem" device. Namely, if there is no UEFI driver
> for that PCI device (and OVMF certainly doesn't have one), then its MMIO
> and IO decoding bits will *never* be set. As I said, command register
> massaging is the jurisdiction of the individual UEFI driver that
> ultimately binds the device -- and OVMF has no UEFI driver for ivshmem.
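
For reference, the "decoding bits" discussed above are bits 0 and 1 of the 16-bit PCI command register at config space offset 0x04. A minimal Python sketch of how a raw register value could be interpreted follows; the helper name decode_command and the two sample values are illustrative assumptions, not readings from a real guest.

```python
# Bit positions in the PCI command register (config space offset 0x04).
PCI_COMMAND_IO = 0x1      # bit 0: device responds to I/O space accesses
PCI_COMMAND_MEMORY = 0x2  # bit 1: device responds to memory space accesses
PCI_COMMAND_MASTER = 0x4  # bit 2: device may act as a bus master (DMA)


def decode_command(value):
    """Return which decode/bus-master bits are set in a command register value."""
    return {
        "io_decode": bool(value & PCI_COMMAND_IO),
        "mmio_decode": bool(value & PCI_COMMAND_MEMORY),
        "bus_master": bool(value & PCI_COMMAND_MASTER),
    }


# Hypothetical values: a device no driver ever bound, and a fully enabled one.
print(decode_command(0x0000))
print(decode_command(0x0007))
```

If the command register of the driver-less device reads back with bits 0 and 1 clear, the device is not decoding its BARs at all, which is the situation described above for ivshmem under OVMF.
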
> 
> Therefore you should probably try to reproduce the issue with another
> PCI device type that OVMF has a driver for, but Windows has none
> (installed at least). I'm quite hard pressed to name such a device type,
> unfortunately. :(

virtio?

> Perhaps one of the more obscure emulated NICs could work in place of
> ivshmem. (The IPXE oproms provide UEFI drivers for those.)
> 
> Thanks
> Laszlo
