Re: [Qemu-devel] Multi GPU passthrough via VFIO
From: Alex Williamson
Subject: Re: [Qemu-devel] Multi GPU passthrough via VFIO
Date: Wed, 05 Feb 2014 14:27:54 -0700
On Wed, 2014-02-05 at 22:10 +0100, Maik Broemme wrote:
> Hi Alex,
>
> Alex Williamson <address@hidden> wrote:
> > On Wed, 2014-02-05 at 19:59 +0100, Maik Broemme wrote:
> > > Hi,
> > >
> > > currently VFIO with multi GPU passthrough is working partially and
> > > hopefully somebody has a hint about the problem. I'm doing passthrough
> > > of an AMD Radeon R9 290X and AMD Radeon 7870 GHz Edition to a single VM.
> > >
> > > If the VM is running Linux this works quite well with radeon or fglrx
> > > driver. Please see 'dmesg' log attached, when using the radeon driver.
> > > If needed I can also post one with fglrx driver.
> > >
> > > If I do the exact same passthrough to a Windows VM and use latest AMD
> > > Catalyst 14.1 (2/1/2014) or AMD Catalyst 13.12 (12/18/2013) I can get
> > > only the first device working (AMD R9 290X) with 'x-vga=on'. I don't
> > > enable 'x-vga=on' on second device as this should never work. :)
> >
> > Why not? The guest is able to change the VGA enable bit in the emulated
> > bridge registers and access VGA space of each device, just like happens
> > on bare metal. You'll only get one device initialized from seabios, but
> > that's the same as would happen on bare metal as well.
> >
>
> Well it was just my guess as it would behave like most physical boxes
> in this case. :)
>
> > > I see
> > > BIOS boot screen and everything works fine except for the second GPU.
> > > The Windows device manager just shows me "Code 12" for the second GPU
> > > and its HD Audio device. Code 12 means: "This device cannot find enough
> > > free resources that it can use".
> >
> > I've seen the same using Nvidia GRID GPUs (w/o x-vga=on), but only with
> > the Q35 chipset model, Linux works, Windows reports Code 12. I have no
> > idea why as all the PCI resources appear to be properly sized and
> > mapped. FWIW, 2 GRID GPUs assigned to a guest do work with the 440FX
> > chipset model. Beyond 2 we run out of MMIO resources below 4G and
> > something bad happens.
> >
>
> Interesting. I will try 440FX a bit later and see if this works. What I
> can also do is to post system resource conflicts from Windows, the AMD
> Catalyst Center has it integrated. Maybe this will help?
If you actually see conflicts, then yes. In the Code 12 cases I've seen,
I was never able to identify a conflict. The trouble with 440FX is that
you'll need to use pci-bridges to isolate the VGA space of each GPU.
Otherwise one card would need to be disabled to ensure that VGA accesses
go to the other.
> > > QEMU is called in both cases via the following. I just replace the
> > > '-drive' accordingly.
> > >
> > > /usr/bin/taskset -c 0,1,2,3 /usr/bin/qemu-system-x86_64 \
> > > -machine q35,accel=kvm \
> > > -enable-kvm \
> > > -nodefaults \
> > > -nographic \
> > > -vga none \
> > > -boot order=nc \
> > > -cpu host \
> > > -smp cores=4,threads=1,sockets=1 \
> > > -m 8192 \
> > > -rtc base=localtime \
> > > -k de \
> > > -drive file=/srv/kvm/linux-drive0.img,id=drive0,if=none,cache=none,aio=threads \
> > > -mon chardev=monitor0 \
> > > -chardev socket,id=monitor0,path=/tmp/linux.monitor,nowait,server \
> > > -netdev tap,id=net0,vhost=on,helper=/usr/lib/qemu/qemu-bridge-helper \
> > > -device virtio-net-pci,netdev=net0,mac=00:00:00:02:01:04 \
> > > -device virtio-blk-pci,drive=drive0,ioeventfd=on \
> > > -device ioh3420,bus=pcie.0,id=pcie0,port=1,chassis=1,multifunction=on \
> > > -device ioh3420,bus=pcie.0,id=pcie1,port=2,chassis=2,multifunction=on \
> > > -device vfio-pci,host=01:00.0,addr=00.0,bus=pcie0,multifunction=on,x-vga=on \
> > > -device vfio-pci,host=01:00.1,addr=00.1,bus=pcie0 \
> > > -device vfio-pci,host=02:00.0,addr=00.0,bus=pcie1,multifunction=on \
> > > -device vfio-pci,host=02:00.1,addr=00.1,bus=pcie1 \
> > > -no-reboot
> > >
> > > My setup is the following:
> > >
> > > Kernel: linux-3.13.1
> > > Seabios: seabios-git-rel.1.7.4.r51.g151d034 (5/2/2014)
> > > QEMU: qemu-git-2.0.r30666.g31db5b3 (5/2/2014)
> > >
> > > Below is the 'lspci' output and I'm using the AMD Radeon HD 5430 as device
> > > for my local X server:
> > >
> > > 00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to
> > > PCI bridge (external gfx0 port B) (rev 02)
> > > 00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD/ATI] RD990 I/O Memory
> > > Management Unit (IOMMU)
> > > 00:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to
> > > PCI bridge (PCI express gpp port B)
> > > 00:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to
> > > PCI bridge (PCI express gpp port D)
> > > 00:09.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to
> > > PCI bridge (PCI express gpp port H)
> > > 00:0d.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to
> > > PCI bridge (external gfx1 port B)
> > > 00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI]
> > > SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (rev 40)
> > > 00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI]
> > > SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> > > 00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI]
> > > SB7x0/SB8x0/SB9x0 USB EHCI Controller
> > > 00:13.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI]
> > > SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> > > 00:13.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI]
> > > SB7x0/SB8x0/SB9x0 USB EHCI Controller
> > > 00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus
> > > Controller (rev 42)
> > > 00:14.2 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia
> > > (Intel HDA) (rev 40)
> > > 00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD/ATI]
> > > SB7x0/SB8x0/SB9x0 LPC host controller (rev 40)
> > > 00:14.4 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 PCI to
> > > PCI Bridge (rev 40)
> > > 00:14.5 USB controller: Advanced Micro Devices, Inc. [AMD/ATI]
> > > SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
> > > 00:15.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI]
> > > SB700/SB800/SB900 PCI to PCI bridge (PCIE port 0)
> > > 00:15.1 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI]
> > > SB700/SB800/SB900 PCI to PCI bridge (PCIE port 1)
> > > 00:15.2 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB900 PCI to
> > > PCI bridge (PCIE port 2)
> > > 00:15.3 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB900 PCI to
> > > PCI bridge (PCIE port 3)
> > > 00:16.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI]
> > > SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> > > 00:16.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI]
> > > SB7x0/SB8x0/SB9x0 USB EHCI Controller
> > > 00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h
> > > Processor Function 0
> > > 00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h
> > > Processor Function 1
> > > 00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h
> > > Processor Function 2
> > > 00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h
> > > Processor Function 3
> > > 00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h
> > > Processor Function 4
> > > 00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h
> > > Processor Function 5
> > > 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
> > > Hawaii XT [Radeon HD 8970]
> > > 01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Device aac8
> > > 02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
> > > Pitcairn XT [Radeon HD 7870 GHz Edition]
> > > 02:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cape
> > > Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
> > > 03:00.0 USB controller: Etron Technology, Inc. EJ168 USB 3.0 Host
> > > Controller (rev 01)
> > > 04:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
> > > Park [Mobility Radeon HD 5430]
> > > 04:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cedar HDMI
> > > Audio [Radeon HD 5400/6300 Series]
> > > 06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
> > > RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
> > > 07:00.0 USB controller: Etron Technology, Inc. EJ168 USB 3.0 Host
> > > Controller (rev 01)
> > >
> > > Another minor issue is that the R9 290X is not reset during shutdown of
> > > VM (neither Linux nor Windows) but it can be tricked with doing
> > > "suspend-to-ram" between two starts. That's why I use '-no-reboot' option
> > > in QEMU. The 7870 is doing the reset properly.
> >
> >
> > Is the NoSoftRst "-" on the 290X vs "+" on the 7870 in lspci -vvv by
> > chance? Thanks,
> >
>
> Here are both. It is funny it is opposite as you described. :)
Oops, yes. Does this help?
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -3136,7 +3136,7 @@ static void vfio_pci_reset_handler(void *opaque)
     QLIST_FOREACH(group, &group_list, next) {
         QLIST_FOREACH(vdev, &group->device_list, next) {
-            if (!vdev->reset_works || (!vdev->has_flr && vdev->has_pm_reset)) {
+            if (!vdev->reset_works || !vdev->has_flr) {
                 vdev->needs_reset = true;
             }
         }
I can't figure out why I coded it the way that I did. Probably overly
targeting a specific device. Thanks,
Alex
> address@hidden:~# lspci -vvv -s 01:00.0 | grep NoSoftRst
> Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
>
> address@hidden:~# lspci -vvv -s 02:00.0 | grep NoSoftRst
> Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>
> address@hidden:~# lspci -vvv -s 01:00.0
> 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
> Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
> Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR+ FastB2B- DisINTx+
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
> <MAbort- >SERR- <PERR- INTx-
> Latency: 0, Cache Line Size: 64 bytes
> Interrupt: pin A routed to IRQ 49
> Region 0: Memory at c0000000 (64-bit, prefetchable) [size=256M]
> Region 2: Memory at df800000 (64-bit, prefetchable) [size=8M]
> Region 4: I/O ports at be00 [size=256]
> Region 5: Memory at fdd80000 (32-bit, non-prefetchable) [size=256K]
> [virtual] Expansion ROM at d0000000 [disabled] [size=128K]
> Capabilities: [48] Vendor Specific Information: Len=08 <?>
> Capabilities: [50] Power Management version 3
> Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA
> PME(D0-,D1+,D2+,D3hot+,D3cold-)
> Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
> DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1
> unlimited
> ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
> DevCtl: Report errors: Correctable- Non-Fatal- Fatal-
> Unsupported-
> RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
> MaxPayload 128 bytes, MaxReadReq 512 bytes
> DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr-
> TransPend-
> LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit
> Latency L0s <64ns, L1 <1us
> ClockPM- Surprise- LLActRep- BwNot-
> LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
> ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> LnkSta: Speed 5GT/s, Width x16, TrErr- Train- SlotClk+
> DLActive- BWMgmt- ABWMgmt-
> DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-,
> OBFF Not Supported
> DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-,
> OBFF Disabled
> LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
> Transmit Margin: Normal Operating Range,
> EnterModifiedCompliance- ComplianceSOS-
> Compliance De-emphasis: -6dB
> LnkSta2: Current De-emphasis Level: -3.5dB,
> EqualizationComplete-, EqualizationPhase1-
> EqualizationPhase2-, EqualizationPhase3-,
> LinkEqualizationRequest-
> Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
> Address: 00000000fee00000 Data: 0000
> Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1
> Len=010 <?>
> Capabilities: [150 v2] Advanced Error Reporting
> UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
> RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
> RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt-
> RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
> Capabilities: [270 v1] #19
> Capabilities: [2b0 v1] Address Translation Service (ATS)
> ATSCap: Invalidate Queue Depth: 00
> ATSCtl: Enable+, Smallest Translation Unit: 00
> Capabilities: [2c0 v1] #13
> Capabilities: [2d0 v1] #1b
> Kernel driver in use: vfio-pci
> Kernel modules: radeon
>
> address@hidden:~# lspci -vvv -s 02:00.0
> 02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
> Pitcairn XT [Radeon HD 7870 GHz Edition] (prog-if 00 [VGA controller])
> Subsystem: XFX Pine Group Inc. Device 3251
> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR+ FastB2B- DisINTx+
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
> <MAbort- >SERR- <PERR- INTx-
> Latency: 0, Cache Line Size: 64 bytes
> Interrupt: pin A routed to IRQ 48
> Region 0: Memory at a0000000 (64-bit, prefetchable) [size=256M]
> Region 2: Memory at fda80000 (64-bit, non-prefetchable) [size=256K]
> Region 4: I/O ports at ee00 [size=256]
> [virtual] Expansion ROM at fda00000 [disabled] [size=128K]
> Capabilities: [48] Vendor Specific Information: Len=08 <?>
> Capabilities: [50] Power Management version 3
> Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA
> PME(D0-,D1+,D2+,D3hot+,D3cold-)
> Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
> Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
> DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1
> unlimited
> ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
> DevCtl: Report errors: Correctable- Non-Fatal- Fatal-
> Unsupported-
> RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
> MaxPayload 128 bytes, MaxReadReq 512 bytes
> DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr-
> TransPend-
> LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit
> Latency L0s <64ns, L1 <1us
> ClockPM- Surprise- LLActRep- BwNot-
> LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
> ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> LnkSta: Speed 5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive-
> BWMgmt- ABWMgmt-
> DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-,
> OBFF Not Supported
> DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-,
> OBFF Disabled
> LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
> Transmit Margin: Normal Operating Range,
> EnterModifiedCompliance- ComplianceSOS-
> Compliance De-emphasis: -6dB
> LnkSta2: Current De-emphasis Level: -3.5dB,
> EqualizationComplete-, EqualizationPhase1-
> EqualizationPhase2-, EqualizationPhase3-,
> LinkEqualizationRequest-
> Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
> Address: 00000000fee00000 Data: 0000
> Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1
> Len=010 <?>
> Capabilities: [150 v2] Advanced Error Reporting
> UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
> RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
> RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt-
> RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
> Capabilities: [270 v1] #19
> Capabilities: [2b0 v1] Address Translation Service (ATS)
> ATSCap: Invalidate Queue Depth: 00
> ATSCtl: Enable+, Smallest Translation Unit: 00
> Capabilities: [2c0 v1] #13
> Capabilities: [2d0 v1] #1b
> Kernel driver in use: vfio-pci
> Kernel modules: radeon
>
> > Alex
> >
>
> --Maik