
Re: [Qemu-devel] VFIO VGA test branches


From: Alex Williamson
Subject: Re: [Qemu-devel] VFIO VGA test branches
Date: Sun, 19 May 2013 21:44:25 -0600

On Fri, 2013-05-17 at 01:09 -0700, Justin Gottula wrote:
> Hi Alex,
> 
> VGA passthrough is working great here, with the exception of device reset.
> 
> In short, everything works the first time the guest runs. But the second
> time I start the guest, before anything comes on the screen, the host
> grinds to a halt and freezes (gradually, until after a few moments, magic
> sysrq doesn't even work). No 'reduced performance' here, just a completely
> frozen system.
> 
> Suspending and waking the host in between guest runs, while inconvenient,
> completely avoids the problem. No more freezes and full graphics
> performance in the guest. So my guess is that the PCI reset from software
> just isn't happening for some reason.
> 
> The passthru devices are being assigned to pci-stub and vfio-pci as they
> should be. Secondary passthrough (-vga cirrus or -vga std) doesn't change
> much: things still work, and reset still doesn't. One small difference is
> that the freeze on second boot is delayed until Windows initializes the
> secondary graphics adapter, since the device isn't touched by the BIOS
> prior to that.
> 
> Overriding the video BIOS doesn't seem to change anything. Passing through
> just the VGA device (excluding the HDMI audio device) doesn't seem to make
> much of a difference either.
> 
> - hardware
> ASUS M5A99X EVO (AMD 990X/SB950; the IVRS is broken and overridden)
> AMD Radeon HD 5750 (for the host)
> AMD Radeon HD 7870 (for passthru)
> 
> - software: host
> linux (Joerg Roedel's iommu tree) with linux-vfio merged in
> qemu (latest git with vfio)

Are you dependent on Joerg's tree for the IVRS fixup?  It would be
preferable to start with just my vfio-vga-reset branches before adding
more variables.  Also, be sure you're using the correct branch to get
the PCI bus reset code.  You can verify with something like:

grep VFIO_DEVICE_PCI_BUS_RESET qemu.git/hw/misc/vfio.c
grep VFIO_DEVICE_PCI_BUS_RESET linux.git/drivers/vfio/pci/vfio_pci.c

I have seen timer messages from the host, and they can make the host
very unresponsive, though I'm not sure I've seen a full freeze.  In one
case I've also seen this with the PCI bus reset code, where the bus
didn't return to a working state (the reason I haven't reposted the PCI
changes upstream yet).  That was on a system where resetting another
slot works quite well, so when I get a chance to look at it again, I'll
probably start by checking whether the problem is unique to the slot.
If you can manage to run lspci once the system is in this state, you can
tell whether the bus is still in reset: the affected devices will report
rev ff and an unknown header type 7f.
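
As a rough sketch of that lspci check: the helper below reads lspci
output on stdin and prints the address of every device whose line
carries "(rev ff)".  The function name devices_in_reset is made up for
illustration, and it assumes the usual one-line-per-device lspci format.

```shell
# Hypothetical helper (not part of pciutils): list devices that lspci
# shows with "(rev ff)", i.e. whose config space reads back as all ones.
# Reads lspci output on stdin so it also works on a saved log.
devices_in_reset() {
    awk 'tolower($0) ~ /\(rev ff\)/ { print $1 }'
}

# Example use on a live host:
#   lspci | devices_in_reset
```

An empty result means no device currently reports rev ff; it does not
by itself prove the reset completed cleanly.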

> seabios (latest git)
> 
> - software: guest
> windows 8
> amd catalyst 13.5b2
> virtio drivers
> 
> - lspci (abbreviated)
> 00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (external gfx0 port B) (rev 02)
> 00:00.2 IOMMU: Advanced Micro Devices [AMD] nee ATI RD990 I/O Memory Management Unit (IOMMU)
> 00:02.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port B)
> 00:03.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port C)
> 00:14.0 SMBus: Advanced Micro Devices [AMD] nee ATI SBx00 SMBus Controller (rev 42)
> 01:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Juniper [Radeon HD 5700 Series]
> 01:00.1 Audio device: Advanced Micro Devices [AMD] nee ATI Juniper HDMI Audio [Radeon HD 5700 Series]
> 02:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Pitcairn [Radeon HD 7800]
> 02:00.1 Audio device: Advanced Micro Devices [AMD] nee ATI Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
> 
> - lspci -n (abbreviated)
> 01:00.0 0300: 1002:68be
> 01:00.1 0403: 1002:aa58
> 02:00.0 0300: 1002:6818
> 02:00.1 0403: 1002:aab0
> 
> - lspci -t (abbreviated)
> -[0000:00]-+-00.0
>            +-00.2
>            +-02.0-[01]--+-00.0
>            |            \-00.1
>            +-03.0-[02]--+-00.0
>                         \-00.1
> 
> - kernel cmdline
> /vmlinuz-linux-iommu initrd=/initramfs-linux-iommu.img
> root=/dev/CorsairVG/ArchLinux rootflags=subvol=root
> systemd.unit=graphical.target debug nomodeset vga=804 acpi.debug_level=0x2
> acpi.debug_layer=0xFFFFFFFF amd_iommu_dump
> vfio_iommu_type1.allow_unsafe_interrupts=1 ivrs_ioapic[9]=00:14.0
> ivrs_ioapic[10]=00:00.1
> 
> - qemu options
> qemu-system-x86_64 -enable-kvm -name Windows8 \
>  -M q35 -nodefconfig -readconfig /pool/KVM/Windows8/q35-chipset.cfg \
>  -m 4096 -balloon none \
>  -rtc base=localtime \
>  -cpu host -smp 8,sockets=1,cores=4,threads=2 \
>  -bios /usr/share/qemu/bios.bin \
>  -vga none \
>  -drive if=virtio,format=raw,discard=on,cache=none,file=/dev/CorsairVG/Windows \
>  -drive if=virtio,format=raw,file=/pool/KVM/Windows8/WinData.ntfs.img \
>  -drive id=cdrom,media=cdrom,format=raw,file=/dev/null \
>  -device ide-cd,bus=ide.0,drive=cdrom \
>  -boot order=dc,menu=on \
>  -net nic,model=virtio,macaddr=00:55:aa:00:00:01 -net bridge,br=vm_br \
>  -soundhw hda \
>  -usbdevice tablet \
>  -device vfio-pci,host=02:00.0,bus=ich9-pcie-port-1,addr=0.0,multifunction=on,x-vga=on \
>  -device vfio-pci,host=02:00.1,bus=ich9-pcie-port-1,addr=0.1
> 
> - dmesg | egrep -i '(iommu|ioapic|pci-stub|vfio)' | grep -vi 'command line'
> [    0.000000] ACPI: IOAPIC (id[0x09] address[0xfec00000] gsi_base[0])
> [    0.000000] IOAPIC[0]: apic_id 9, version 33, address 0xfec00000, GSI 0-23
> [    0.000000] ACPI: IOAPIC (id[0x0a] address[0xfec20000] gsi_base[24])
> [    0.000000] IOAPIC[1]: apic_id 10, version 33, address 0xfec20000, GSI 24-55
> [    0.223213] AMD-Vi:   DEV_SPECIAL(IOAPIC[0])         devid: 00:14.0
> [    0.223219] AMD-Vi:   DEV_SPECIAL(IOAPIC[255])               devid: 00:00.1
> [    0.600863] ACPI: Using IOAPIC for interrupt routing
> [    2.033551] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
> [    2.191528] VFIO - User Level meta-driver version: 0.3
> [    3.601002] pci-stub: add 1002:6818 sub=FFFFFFFF:FFFFFFFF cls=00000000/00000000
> [    3.607895] pci-stub 0000:02:00.0: claimed by stub
> [    3.614688] pci-stub: add 1002:AAB0 sub=FFFFFFFF:FFFFFFFF cls=00000000/00000000
> [    3.621640] pci-stub 0000:02:00.1: claimed by stub

02:00.0/1 are added to pci-stub here, but used by vfio-pci below.  Is
pci-stub just temporary to keep radeon from binding to them?
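
One way to see which driver actually holds each function at a given
moment is to read the driver symlink in sysfs.  A minimal sketch,
assuming the standard /sys/bus/pci/devices layout; the function name
bound_driver and its optional sysfs-root argument are made up for
illustration:

```shell
# Hypothetical helper: print the driver a PCI device is bound to,
# or "none" if it has no driver.
# usage: bound_driver <domain:bus:dev.fn> [sysfs-root]
bound_driver() {
    dev=$1
    root=${2:-/sys}
    link="$root/bus/pci/devices/$dev/driver"
    if [ -L "$link" ]; then
        # The symlink points at .../bus/pci/drivers/<name>
        basename "$(readlink "$link")"
    else
        echo none
    fi
}

# Example use on a live host:
#   bound_driver 0000:02:00.0
#   bound_driver 0000:02:00.1
```

Running it before starting the guest and again while the guest is up
would show whether the devices move from pci-stub to vfio-pci.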

> [ 5137.969990] vfio-pci 0000:02:00.0: enabling device (0000 -> 0003)
> [ 5137.995842] vfio_ecap_init: 0000:02:00.0 hiding ecap address@hidden
> [ 5137.995849] vfio_ecap_init: 0000:02:00.0 hiding ecap address@hidden
> [ 5166.727114] vfio-pci 0000:02:00.0: irq 93 for MSI/MSI-X
> (this is from before the second boot attempt)
> 
> - last two lines from dmesg before the freeze (netcat'd to another box)
> Clocksource tsc unstable (delta = -416526709 ns)
> Switching to clocksource hpet
> 
> - output with DEBUG_VFIO: lots and lots, see attachment
> 
> If you need any more information, I'll be glad to provide it.

Thanks,

Alex




