From: Laszlo Ersek
Subject: Re: [Qemu-devel] how Windows treats BARs of driver-less devices when other devices are hotplugged
Date: Thu, 25 Feb 2016 15:05:08 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0

On 02/25/16 14:30, Michael S. Tsirkin wrote:
> On Thu, Feb 25, 2016 at 02:00:09PM +0100, Laszlo Ersek wrote:
>> On 02/25/16 13:44, Laszlo Ersek wrote:
>>> Hi,
>>>
>>> On 02/25/16 12:57, Michael S. Tsirkin wrote:
>>>> ----- Forwarded message from Igor Mammedov <address@hidden> -----
>>>>
>>>> Date: Thu, 11 Feb 2016 16:16:05 +0100
>>>> From: Igor Mammedov <address@hidden>
>>>> To: "Michael S. Tsirkin" <address@hidden>
>>>> To: address@hidden
>>>> Subject: on pci rebalancing
>>>> Message-ID: <address@hidden>
>>>> In-Reply-To: <address@hidden>
>>>>
>>>>>>>> For PCI rebalance to work on Windows, one has to provide a working PCI
>>>>>>>> driver; otherwise the OS will ignore the device when rebalancing happens
>>>>>>>> and might map something else over the ignored BAR.
>>>>>>>
>>>>>>> Does it disable the BAR then? Or just move it elsewhere?  
>>>>>> It doesn't; it just blindly ignores the BAR's existence and maps a BAR
>>>>>> of another device that has a driver over it.
>>>>>
>>>>> Interesting. On classical PCI this is a forbidden configuration.
>>>>> Maybe we do something that confuses windows?
>>>>> Could you tell me how to reproduce this behaviour?
>>>> #cat > t << EOF
>>>> pci_update_mappings_del
>>>> pci_update_mappings_add
>>>> EOF
>>>>
>>>> #./x86_64-softmmu/qemu-system-x86_64 -snapshot -enable-kvm \
>>>>  -monitor unix:/tmp/m,server,nowait -device pci-bridge,chassis_nr=1 \
>>>>  -boot menu=on -m 4G -trace events=t ws2012r2x64dc.img \
>>>>  -device ivshmem,id=foo,size=2M,shm,bus=pci.1,addr=01
>>>>
>>>> wait till OS boots, note BARs programmed for ivshmem
>>>>  in my case it was
>>>>    01:01.0 0,0xfe800000+0x100
>>>> then execute script and watch pci_update_mappings* trace events
>>>>
>>>> # for i in $(seq 3 18); do printf -- "device_add e1000,bus=pci.1,addr=%x\n" $i | nc -U /tmp/m; sleep 5; done;
>>>>
>>>> hotplugging e1000,bus=pci.1,addr=12 triggers rebalancing where
>>>> Windows unmaps all BARs of nics on bridge but doesn't touch ivshmem
>>>> and then programs new BARs, where:
>>>>   pci_update_mappings_add d=0x7fa02ff0cf90 01:11.0 0,0xfe800000+0x20000
>>>> creates overlapping BAR with ivshmem 
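
The overlap reported above can be checked with a little shell arithmetic.
This is only a sketch: bar_overlap is a hypothetical helper name, and it
assumes the "base+size" notation printed by the pci_update_mappings_* trace
events quoted above.

```shell
# Hypothetical helper: report whether two "base+size" ranges overlap.
bar_overlap() {
    # Split each argument at the '+' and let the shell convert the hex values.
    a_base=$(( ${1%%+*} )); a_size=$(( ${1##*+} ))
    b_base=$(( ${2%%+*} )); b_size=$(( ${2##*+} ))
    # Half-open ranges [base, base+size) overlap if and only if each one
    # starts before the other one ends.
    if [ "$a_base" -lt $(( b_base + b_size )) ] &&
       [ "$b_base" -lt $(( a_base + a_size )) ]; then
        echo overlap
    else
        echo disjoint
    fi
}

# ivshmem BAR 0 versus the BAR 0 newly programmed for 01:11.0
bar_overlap 0xfe800000+0x100 0xfe800000+0x20000   # prints: overlap
```
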
>>>
>>> Michael informed me of this on IRC (and forwarded this email to me). I hope 
>>> to start a new thread with my response. (I also reedited the subject fully.)
>>>
>>> So, to summarize what I said on IRC first. The situation where firmware 
>>> recognizes and enables a PCI device, hands control to the OS, and then the 
>>> OS lacks a driver for the PCI device, is completely normal and expected. 
>>> For UEFI specifically, I can name a general argument and a specific 
>>> argument.
>>>
>>> The general argument is that actions that need to be taken in 
>>> ExitBootServices() callbacks do not include clearing IO or MMIO decode bits 
>>> in PCI device command registers. Command register manipulation happens when 
>>> a PCI device driver (that conforms to the UEFI driver model) *binds* or 
>>> *unbinds* a device. And unbinding a device is not possible in the 
>>> ExitBootServices() callback, minimally because such callbacks are forbidden 
>>> from modifying the memory map -- but unbinding would release allocated 
>>> memory.
>>>
>>> So what we use such callbacks for is aborting in-flight, outstanding 
>>> DMA-like transfers. Re-setting virtio devices is also an example (think 
>>> outstanding receive requests for virtio-net).
>>>
>>> Now let's move on to the specific argument I mentioned above. The Graphics 
>>> Output Protocol (GOP) is a UEFI abstraction that was specifically designed 
>>> with the case in mind when the operating system doesn't yet have a display 
>>> driver installed, but the user obviously has to use the display 
>>> somehow. The GOP is most frequently provided on top of an 
>>> EFI_PCI_IO_PROTOCOL instance; meaning simply that the "GOP driver" is a 
>>> UEFI driver that drives a PCI device. In short, the driver provides the GOP 
>>> on top of a PCI device.
>>>
>>> Now, the GOP is supposed to communicate the pixel format and the frame 
>>> buffer base address for the currently active graphics mode to the software 
>>> that consumes the GOP. This includes UEFI applications of course (think a 
>>> boot loader putting up a splash screen or an animation), but importantly, 
>>> the runtime OS is *also* supposed to inherit these characteristics from 
>>> boot services time. The OS can then use simple unaccelerated MMIO writes to 
>>> display things on the screen, until the user installs an accelerated 
>>> driver.
>>>
>>> (Concrete example: this is why you can see *anything at all* on the screen, 
>>> when you run e.g. Windows Server 2012 R2 on top of OVMF and a QXL display, 
>>> before installing the QXL WDDM driver in the guest.)
>>>
>>> Clearly, the frame buffer base address communicated through the GOP points 
>>> into one of the MMIO BARs of the PCI device. If, at ExitBootServices(), 
>>> MMIO decoding were disabled for the PCI device that underlies the GOP, that 
>>> would *completely* defeat the GOP design. The OS's attempt to poke at those 
>>> MMIO addresses would be futile -- and in fact the OS has no idea what PCI 
>>> device (if any) the framebuffer is supposed to be related to. This is the 
>>> jurisdiction of the OS-level display driver -- if one exists and is 
>>> installed.
>>>
>>> So, this is a Windows bug in my opinion. Just because there is no OS-level 
>>> driver, a PCI device is fully expected to be decoding resources, if the 
>>> firmware brought it up.
>>>
>>> --*--
>>>
>>> Okay, so Michael asked me to try to reproduce the above with OVMF, and see 
>>> what happens. Unfortunately I'm not really knowledgeable about ivshmem, 
>>> hotplug, et cetera. Let me instead tell Igor about using OVMF.
>>>
>>> (1) Please follow the instructions on Gerd's page 
>>> <https://www.kraxel.org/repos/>, and install the "edk2.git-ovmf-x64" 
>>> package.
>>>
>>> (2) Create a separate directory for testing. In this directory, run the 
>>> following command:
>>>
>>>   cp /usr/share/edk2.git/ovmf-x64/OVMF_VARS-pure-efi.fd myvars.fd
>>>
>>> Also create a disk image for your new guest, etc.
>>>
>>> (3) Use the following command line snippet to work with OVMF:
>>>
>>>      qemu-system-x86_64 \
>>>        -machine accel=kvm \
>>>        -smp cpus=2 \
>>>        -m 2048 \
>>>        \
>>>        -debugcon file:ovmf.debug.log \
>>>        -global isa-debugcon.iobase=0x402 \
>>>        \
>>>        -device qxl-vga \
>>>        \
>>>        -drive if=pflash,format=raw,unit=0,readonly,file=/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd \
>>>        -drive if=pflash,format=raw,unit=1,file=myvars.fd \
>>>        \
>>>        [your options here]
>>>
>>> You can of course customize the # of VCPUs, memory size, disks, CD-ROMs, 
>>> network, and so on.
>>>
>>> Recommended: when you use the -device option to add the disk and the 
>>> CD-ROM(s) to install the OS (and driver(s)) from, be sure to use the 
>>> "bootindex" property. OVMF will adhere to the boot order. It is recommended 
>>> to set bootindex=0 for your main disk, bootindex=1 for your OS installer 
>>> CD-ROM, and *no* bootindex for your virtio-win driver disk. This way at 
>>> first boot (with no OS installed) OVMF will boot the installer CD-ROM. 
>>> Further boots (with the same command line) will boot the installed OS.
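
One possible way to spell out the bootindex recommendation above is sketched
here. The file names (guest.qcow2, installer.iso, virtio-win.iso), the drive
IDs, and the choice of virtio-blk-pci and ide-cd are all assumptions for
illustration; substitute whatever devices and images you actually use.

```shell
# Hypothetical options to put in place of "[your options here]".
BOOT_OPTS="
  -drive if=none,id=disk0,format=qcow2,file=guest.qcow2
  -device virtio-blk-pci,drive=disk0,bootindex=0
  -drive if=none,id=cd0,media=cdrom,readonly=on,file=installer.iso
  -device ide-cd,drive=cd0,bootindex=1
  -drive if=none,id=cd1,media=cdrom,readonly=on,file=virtio-win.iso
  -device ide-cd,drive=cd1
"
# The driver CD-ROM (cd1) deliberately gets no bootindex.
```

The variable would then be appended, unquoted, to the qemu-system-x86_64
command line.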
>>>
>>> Caveat: I never used the -snapshot option with OVMF virtual machines; it 
>>> might or might not work.
>>>
>>> Caveat #2: I tested simple PCI hotplug and hot-unplug with Windows 
>>> running on OVMF many months ago, but I can't tell off-hand if it will work 
>>> right now.
>>
>> I should also mention that you might not be able to reproduce the same
>> situation with the "ivshmem" device. Namely, if there is no UEFI driver
>> for that PCI device (and OVMF certainly doesn't have one), then its MMIO
>> and IO decoding bits will *never* be set. As I said, command register
>> massaging is the jurisdiction of the individual UEFI driver that
>> ultimately binds the device -- and OVMF has no UEFI driver for ivshmem.
>>
>> Therefore you should probably try to reproduce the issue with another
>> PCI device type that OVMF has a driver for, but Windows has none
>> (installed at least). I'm quite hard pressed to name such a device type,
>> unfortunately. :(
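
To make the "decoding bits" concrete: in the PCI command register, bit 0
enables I/O space decoding and bit 1 enables memory space decoding. The sketch
below decodes a raw command register value. pci_decode_bits is a hypothetical
helper name, and how the value is obtained (for example from a config space
dump) is left open.

```shell
# Hypothetical helper: show the decode-enable bits of a PCI command register.
pci_decode_bits() {
    cmd=$(( $1 ))
    # Bit 0 = I/O Space Enable, bit 1 = Memory Space Enable.
    if [ $(( cmd & 0x1 )) -ne 0 ]; then io=on; else io=off; fi
    if [ $(( cmd & 0x2 )) -ne 0 ]; then mem=on; else mem=off; fi
    echo "io=$io mem=$mem"
}

pci_decode_bits 0x0000   # prints: io=off mem=off
pci_decode_bits 0x0003   # prints: io=on mem=on
```

The first call corresponds to a device that no firmware driver ever bound,
such as ivshmem under OVMF as described above.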
> 
> virtio?

... was my first thought as well, but OVMF at the moment supports only
legacy (0.9.5) virtio-pci devices (and virtio-mmio only on AARCH64) --
those don't have MMIO BARs, only IO BARs.

Theoretically the Windows overlap issue should be triggerable with IO
BARs just the same (resource - resource, right?), but I doubt it will be
reproducible in practice.

Laszlo

>> Perhaps one of the more obscure emulated NICs could work in place of
>> ivshmem. (The IPXE oproms provide UEFI drivers for those.)
>>
>> Thanks
>> Laszlo



