qemu-discuss

Re: [Qemu-discuss] Failing to get PCI pass-through/muli-seat working wit


From: Brian Yglesias
Subject: Re: [Qemu-discuss] Failing to get PCI pass-through/muli-seat working with any multidisk configuration
Date: Tue, 16 Aug 2016 20:06:29 -0700 (PDT)

I forgot to mention that I can assign 2 GPUs to 1 VM.  The problem is only with 
two concurrent VMs with 1 GPU each.

----- Original Message -----
From: "Brian Yglesias" <address@hidden>
To: "qemu-discuss" <address@hidden>
Sent: Monday, August 15, 2016 5:24:36 AM
Subject: Failing to get PCI pass-through/muli-seat working with any multidisk 
configuration

Hello everyone.

It seems the only way I can get multi-seat to work is by having the OS and the 
VMs on a single disk, and after weeks of futility I'm starting to wonder if I 
can even replicate that.

I have two VMs which work surprisingly well with VFIO/IOMMU, unless I run them 
concurrently.  If I do, then the display driver will crash on one VM followed 
shortly by the other.  I've replicated this problem with multiple kernels from 
4.2.1 to 4.7.X, and on two X58/LGA1366 MBs, so I suspect it affects most or all 
of them, at least when used with Debian / Proxmox.

There is nothing in the system logs to indicate why.
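
When checking the host logs for this kind of failure, it can help to narrow the kernel log to passthrough-related messages. A minimal sketch follows; the function name and the pattern list are assumptions for illustration, not something from the setup described here.

```shell
# Sketch: keep only IOMMU/VFIO-related lines from a kernel log read on stdin.
# The pattern list is an assumption about which messages are relevant.
filter_passthrough_log() {
    grep -iE 'DMAR|IOMMU|vfio'
}

# Example use on a live host (may require root):
#   dmesg | filter_passthrough_log
```

An empty result would be consistent with the statement above that nothing shows up in the logs.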

Here are the specs on the system I'm currently working on.

Distro:  Debian 8 / Proxmox 4.2
MB:  Asus Rampage III
CPU:  Xeon X5670
RAM:  24 GB

DISK1:  OS - XFS/LVM
DISK2-4:  VMs - ZFS RAIDZ-1

I've also seen the same on a GA-EX58 mb, set up identically.


I've tried ZFS, and MDADM with and without LVM, using MDADM RAID levels 5, 1, 
and even 0.

I thought for sure that in the worst case scenario I would be able to assign a 
VM per disk.  Not so.

Oddly, it has actually gotten worse.  Before, I needed to start something 3D 
on both VMs to reliably crash them (usually within seconds of each other).  
Now all I need to do is start the second VM, and the display driver crashes on 
the first one.  (The fact that both VMs always crash has to be indicative of 
something, but I'm not sure what.)

I'm pretty much back at the drawing board.  I'm actually starting to doubt that 
my 'single disk test' really worked.  Maybe I just didn't run it long enough?  
So I will try that again.  Unfortunately, I only have spindle disks large 
enough to hold everything on hand right now, so it won't be an exact replica.

Beyond that, I really don't know.  I currently have the system set up in almost 
the most basic way that is still acceptable:
- OS on a single 120 GB SSD
- VM root pool on three 240 GB SSDs, RAIDZ-1

Soft rebooting a VM will always cause that VM's display to get garbled on POST.  
I don't even have to get into Windows: if that happens, I know the VM is beyond 
salvation, and the second one is going down too.

I'm beginning to think this is somehow tied to my X58 chipset mbs (happens 
identically on both a Gigabyte and Asus board with that chipset), or the 
qemu/kvm that comes with Proxmox.  A third possibility may be some 
server-oriented tuning cooked into Proxmox.  (Maybe I'll do single disk this 
time with regular Debian, and see if there is some change.)

Proxmox has a bug which sets HV_Vendor_ID to 'Proxmox' rather than 
'Nvidia43Fix', which causes a Code 43 in the Nvidia Driver (it says in device 
manager:  "has reported a problem and has been stopped", or some such).  As a 
result I launch the VMs from the console based on the tweaked output of 'qm 
showcmd <vmid>':

VM1:

# sed -e 's/#.*$//' -e '/^$/d' /root/src/brian.1
/usr/bin/systemd-run \
--scope \
--slice qemu \
--unit 110 \
-p KillMode=none \
-p CPUShares=250000 \
/usr/bin/kvm -id 110 \
-chardev socket,id=qmp,path=/var/run/qemu-server/110.qmp,server,nowait \
-mon chardev=qmp,mode=control \
-pidfile /var/run/qemu-server/110.pid \
-daemonize \
-smbios type=1,uuid=6a9ea4a2-48bd-415e-95fb-adf8c9db44f7 \
-drive if=pflash,format=raw,readonly,file=/usr/share/kvm/OVMF-pure-efi.fd \
-drive if=pflash,format=raw,file=/root/sbin/110-OVMF_VARS-pure-efi.fd \
-name Brian-PC \
-smp 12,sockets=1,cores=12,maxcpus=12 \
-nodefaults \
-boot menu=on,strict=on,reboot-timeout=1000 \
-vga none \
-nographic \
-no-hpet \
-cpu host,hv_vendor_id=Nvidia43FIX,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_relaxed,+kvm_pv_unhalt,+kvm_pv_eoi,kvm=off \
-m 8192 \
-object memory-backend-ram,size=8192M,id=ram-node0 \
-numa node,nodeid=0,cpus=0-11,memdev=ram-node0 \
-k en-us \
-readconfig /usr/share/qemu-server/pve-q35.cfg \
-device usb-tablet,id=tablet,bus=ehci.0,port=1 \
-device vfio-pci,host=04:00.0,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0 \
-device vfio-pci,host=04:00.1,id=hostpci1,bus=ich9-pcie-port-2,addr=0x0 \
-device usb-host,hostbus=1,hostport=6.1 \
-device usb-host,hostbus=1,hostport=6.2.1 \
-device usb-host,hostbus=1,hostport=6.2.2 \
-device usb-host,hostbus=1,hostport=6.2.3 \
-device usb-host,hostbus=1,hostport=6.2 \
-device usb-host,hostbus=1,hostport=6.3 \
-device usb-host,hostbus=1,hostport=6.4 \
-device usb-host,hostbus=1,hostport=6.5 \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 \
-drive file=/dev/zvol/SSD-pool/vm-110-disk-1,if=none,id=drive-virtio0,cache=writeback,format=raw,aio=threads,detect-zeroes=on \
-device virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=100 \
-netdev type=tap,id=net0,ifname=tap110i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on \
-device virtio-net-pci,mac=32:61:36:63:37:64,netdev=net0,bus=pci.0,addr=0x12,id=net0 \
-rtc driftfix=slew,base=localtime \
-machine type=q35 \
-global kvm-pit.lost_tick_policy=discard

VM2:

# sed -e 's/#.*$//' -e '/^$/d' /root/src/madzia.2
/usr/bin/systemd-run \
--scope \
--slice qemu \
--unit 111 \
-p KillMode=none \
-p CPUShares=250000 \
/usr/bin/kvm \
-id 111 \
-chardev socket,id=qmp,path=/var/run/qemu-server/111.qmp,server,nowait \
-mon chardev=qmp,mode=control \
-pidfile /var/run/qemu-server/111.pid \
-daemonize \
-smbios type=1,uuid=55d862f4-d9b9-40ab-9b0a-e1eadf874750 \
-drive if=pflash,format=raw,readonly,file=/usr/share/kvm/OVMF-pure-efi.fd \
-drive if=pflash,format=raw,file=/root/sbin/111-OVMF_VARS-pure-efi.fd \
-name Madzia-PC \
-smp 12,sockets=1,cores=12,maxcpus=12 \
-nodefaults \
-boot menu=on,strict=on,reboot-timeout=1000 \
-vga none \
-nographic \
-no-hpet \
-cpu host,hv_vendor_id=Nvidia43FIX,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_relaxed,+kvm_pv_unhalt,+kvm_pv_eoi,kvm=off \
-m 8192 \
-object memory-backend-ram,size=8192M,id=ram-node0 \
-numa node,nodeid=0,cpus=0-11,memdev=ram-node0 \
-k en-us \
-readconfig /usr/share/qemu-server/pve-q35.cfg \
-device usb-tablet,id=tablet,bus=ehci.0,port=1 \
-device vfio-pci,host=05:00.0,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0 \
-device vfio-pci,host=05:00.1,id=hostpci1,bus=ich9-pcie-port-2,addr=0x0 \
-device usb-host,hostbus=2,hostport=2.1 \
-device usb-host,hostbus=2,hostport=2.2 \
-device usb-host,hostbus=2,hostport=2.3 \
-device usb-host,hostbus=2,hostport=2.4 \
-device usb-host,hostbus=2,hostport=2.5 \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 \
-iscsi initiator-name=iqn.1993-08.org.debian:01:1530d013b944 \
-drive file=/dev/zvol/SSD-pool/vm-111-disk-1,if=none,id=drive-virtio0,cache=writeback,format=raw,aio=threads,detect-zeroes=on \
-device virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=100 \
-netdev type=tap,id=net0,ifname=tap111i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on \
-device virtio-net-pci,mac=4E:F0:DD:90:DB:2D,netdev=net0,bus=pci.0,addr=0x12,id=net0 \
-rtc driftfix=slew,base=localtime \
-machine type=q35 \
-global kvm-pit.lost_tick_policy=discard

However, I've tried many invocations of KVM without success.
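
The hv_vendor_id tweak applied to the 'qm showcmd' output could be scripted rather than edited by hand. This is only a sketch: the function name and the output file name in the example are hypothetical, and it assumes the vendor ID appears as a plain hv_vendor_id=<value> token inside the -cpu option, as it does in the two command lines above.

```shell
# Sketch: rewrite the hv_vendor_id value in a QEMU command line read on stdin.
# $1 is the replacement vendor ID. The value is matched up to the next comma
# or space, so the rest of the -cpu option list is left untouched.
set_hv_vendor_id() {
    sed -e "s/hv_vendor_id=[^, ]*/hv_vendor_id=$1/"
}

# Example (output file name is hypothetical):
#   qm showcmd 110 | set_hv_vendor_id Nvidia43FIX > /root/src/vm110.cmd
```

Lines without an hv_vendor_id token pass through unchanged, so the filter can be run over the whole command.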

Here is how I load my modules:


# cat /etc/modprobe.d/iommu_unsafe_interrupts.conf
options vfio_iommu_type1 allow_unsafe_interrupts=1



# cat /etc/modprobe.d/vfio_pci.conf
options vfio_pci disable_vga=1
#install vfio_pci /root/sbin/vfio-pci-override-vga.sh
options vfio-pci ids=10de:13c2,10de:0fbb,10de:11c0,10de:0e0b



# cat /etc/modprobe.d/zfs.conf
options zfs zfs_arc_max=4299967296
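
The zfs_arc_max value is given in bytes. Assuming the intent was a 4 GiB ARC cap (the post does not say), the exact figure can be computed as in the sketch below; the configured 4299967296 is 5,000,000 bytes above it. The helper name is hypothetical.

```shell
# Sketch: convert a whole number of GiB to bytes using shell arithmetic.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

gib_to_bytes 4    # prints 4294967296
```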



# cat /etc/modprobe.d/kvm.conf
options kvm ignore_msrs=1


... I believe grub is set up correctly ...


# sed -e 's/#.*$//' -e '/^$/d' /etc/default/grub
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="Proxmox Virtual Environment"
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1 quiet"
GRUB_CMDLINE_LINUX=""
GRUB_DISABLE_OS_PROBER=true
GRUB_DISABLE_RECOVERY="true"
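
Settings in /etc/default/grub only take effect after 'update-grub' and a reboot, so one way to confirm that the running kernel actually received them is to inspect /proc/cmdline. A minimal sketch follows; the function name is hypothetical, and the optional file argument exists only for testing.

```shell
# Sketch: succeed if the kernel command line contains the exact argument $1.
# $2 is the file to read; it defaults to /proc/cmdline.
has_kernel_arg() {
    # Put each argument on its own line, then look for an exact whole-line match.
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Example use on the host:
#   has_kernel_arg intel_iommu=on && echo "IOMMU requested on the command line"
```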


...  I believe I have all the correct modules loaded on boot ...


# sed -e 's/#.*$//' -e '/^$/d' /etc/modules
coretemp
it87
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
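
Listing modules in /etc/modules does not by itself show that they loaded. A small sketch for checking against /proc/modules follows. The function name is hypothetical, and it assumes the modules are built as loadable modules, since built-in code does not appear in /proc/modules.

```shell
# Sketch: print the names of modules that are NOT present in a modules list.
# $1 is a file in /proc/modules format; the remaining arguments are module names.
missing_modules() {
    file=$1; shift
    for m in "$@"; do
        # Each /proc/modules line starts with the module name followed by a space.
        grep -q "^$m " "$file" || echo "$m"
    done
}

# Example use on the host:
#   missing_modules /proc/modules vfio vfio_iommu_type1 vfio_pci vfio_virqfd
```

No output means every named module is loaded.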


... Here's the Q35 config file ...


# sed -e 's/#.*$//' -e '/^$/d' /usr/share/qemu-server/pve-q35.cfg
[device "ehci"]
  driver = "ich9-usb-ehci1"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1d.7"
[device "uhci-1"]
  driver = "ich9-usb-uhci1"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1d.0"
  masterbus = "ehci.0"
  firstport = "0"
[device "uhci-2"]
  driver = "ich9-usb-uhci2"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1d.1"
  masterbus = "ehci.0"
  firstport = "2"
[device "uhci-3"]
  driver = "ich9-usb-uhci3"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1d.2"
  masterbus = "ehci.0"
  firstport = "4"
[device "ehci-2"]
  driver = "ich9-usb-ehci2"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1a.7"
[device "uhci-4"]
  driver = "ich9-usb-uhci4"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1a.0"
  masterbus = "ehci-2.0"
  firstport = "0"
[device "uhci-5"]
  driver = "ich9-usb-uhci5"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1a.1"
  masterbus = "ehci-2.0"
  firstport = "2"
[device "uhci-6"]
  driver = "ich9-usb-uhci6"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1a.2"
  masterbus = "ehci-2.0"
  firstport = "4"
[device "audio0"]
  driver = "ich9-intel-hda"
  bus = "pcie.0"
  addr = "1b.0"
[device "ich9-pcie-port-1"]
  driver = "ioh3420"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1c.0"
  port = "1"
  chassis = "1"
[device "ich9-pcie-port-2"]
  driver = "ioh3420"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1c.1"
  port = "2"
  chassis = "2"
[device "ich9-pcie-port-3"]
  driver = "ioh3420"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1c.2"
  port = "3"
  chassis = "3"
[device "ich9-pcie-port-4"]
  driver = "ioh3420"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1c.3"
  port = "4"
  chassis = "4"
[device "pcidmi"]
  driver = "i82801b11-bridge"
  bus = "pcie.0"
  addr = "1e.0"
[device "pci.0"]
  driver = "pci-bridge"
  bus = "pcidmi"
  addr = "1.0"
  chassis_nr = "1"
[device "pci.1"]
  driver = "pci-bridge"
  bus = "pcidmi"
  addr = "2.0"
  chassis_nr = "2"
[device "pci.2"]
  driver = "pci-bridge"
  bus = "pcidmi"
  addr = "3.0"
  chassis_nr = "3"

... and plenty of CPU ...


# cat /proc/cpuinfo | grep -A 4 'processor.*: 11'
processor       : 11
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           X5670  @ 2.93GHz


If anyone has any suggestions, I would greatly appreciate it.


