


From: Bob Chen
Subject: Re: [Qemu-devel] About virtio device hotplug in Q35! [External email, read with caution]
Date: Tue, 22 Aug 2017 15:04:55 +0800

Hi,

I got a spec from Nvidia that describes how to enable GPU P2P in a
virtualization environment (see attached).

The key is to append an Nvidia-customized capability config to the device's
legacy PCI capability list when setting up the hypervisor.

I added a hack in hw/vfio/pci.c and managed to implement that.
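
Roughly, the hack boils down to registering one extra vendor-specific entry
in the assigned device's legacy capability chain. The sketch below is not
the actual patch: the function name vfio_add_nvidia_p2p_cap and the 0x14
length are placeholders, the real payload layout comes from the Nvidia spec,
and pci_add_capability()'s signature differs between QEMU versions (older
trees have a separate pci_add_capability2() taking Error **).

static int vfio_add_nvidia_p2p_cap(VFIOPCIDevice *vdev, Error **errp)
{
    PCIDevice *pdev = &vdev->pdev;
    uint8_t cap_len = 0x14;   /* capability body length, per the spec (assumed) */
    int pos;

    /* Let QEMU pick a free offset in config space (offset 0 = "find one")
     * and chain a vendor-specific capability (ID 0x09) into the list. */
    pos = pci_add_capability(pdev, PCI_CAP_ID_VNDR, 0, cap_len, errp);
    if (pos < 0) {
        return pos;
    }

    /* Byte 2 of a vendor-specific capability is its length; the rest of
     * the body would be filled in from the layout described in the PDF. */
    pci_set_byte(pdev->config + pos + 2, cap_len);

    return 0;
}

The net effect is just that the guest driver sees the extra vendor-specific
capability and turns on its peer-to-peer path.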

Then I found that each GPU was able to recognize its peer, and the latency
dropped. ✅

However, the bandwidth didn't improve; it actually decreased. ❌

Any suggestions?


# p2pBandwidthLatencyTest in VM

[P2P (Peer-to-Peer) GPU Bandwidth Latency Test]
Device: 0, Tesla M60, pciBusID: 0, pciDeviceID: 15, pciDomainID:0
Device: 1, Tesla M60, pciBusID: 0, pciDeviceID: 16, pciDomainID:0
Device=0 CAN Access Peer Device=1
Device=1 CAN Access Peer Device=0

P2P Connectivity Matrix
     D\D     0     1
     0       1     1
     1       1     1

Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D     0      1
     0 114.04   5.33
     1   5.42 113.91

Unidirectional P2P=Enabled Bandwidth Matrix (GB/s)
   D\D     0      1
     0 113.93   4.13
     1   4.13 119.65

Bidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D     0      1
     0 120.50   5.55
     1   5.55 134.98

Bidirectional P2P=Enabled Bandwidth Matrix (GB/s)
   D\D     0      1
     0 135.45   5.03   # Even worse, used to be 10
     1   5.02 135.30

P2P=Disabled Latency Matrix (us)
   D\D     0      1
     0   5.74  15.61
     1  16.05   5.75

P2P=Enabled Latency Matrix (us)
   D\D     0      1
     0   5.47   8.23   # Improved, used to be 18
     1   8.06   5.46
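
For anyone without the CUDA sample handy: the "CAN Access Peer" lines and
the P2P=Enabled numbers come from the CUDA runtime peer-access calls. A
minimal sketch of that check, written against the public runtime API rather
than copied from the sample's source (the file name and the 64 MiB buffer
size are arbitrary choices), compiled with nvcc:

#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int can01 = 0, can10 = 0;

    /* Same query that produces the "CAN Access Peer" lines above. */
    cudaDeviceCanAccessPeer(&can01, 0, 1);
    cudaDeviceCanAccessPeer(&can10, 1, 0);
    printf("0->1: %d, 1->0: %d\n", can01, can10);

    if (can01 && can10) {
        size_t len = 64 << 20;            /* 64 MiB scratch buffers */
        void *buf0, *buf1;

        cudaSetDevice(0);
        cudaMalloc(&buf0, len);
        cudaDeviceEnablePeerAccess(1, 0); /* flags must be 0 */

        cudaSetDevice(1);
        cudaMalloc(&buf1, len);
        cudaDeviceEnablePeerAccess(0, 0);

        /* With peer access enabled, this copy may go GPU-to-GPU over the
         * PCIe fabric instead of staging through host memory -- the path
         * whose bandwidth regressed in the matrices above. */
        cudaMemcpyPeer(buf1, 1, buf0, 0, len);

        cudaFree(buf1);
        cudaSetDevice(0);
        cudaFree(buf0);
    }
    return 0;
}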

2017-08-09 4:07 GMT+08:00 Michael S. Tsirkin <address@hidden>:

> On Mon, Aug 07, 2017 at 09:52:24AM -0600, Alex Williamson wrote:
> > I wonder if it has something to do
> > with the link speed/width advertised on the switch port.  I don't think
> > the endpoint can actually downshift the physical link, so lspci on the
> > host should probably still show the full bandwidth capability, but
> > maybe the driver is somehow doing rate limiting.  PCIe gets a little
> > more complicated as we go to newer versions, so it's not quite as
> > simple as exposing a different bit configuration to advertise 8GT/s,
> > x16. Last I tried to do link matching it was deemed too complicated
> > for something I couldn't prove at the time had measurable value.  This
> > might be a good way to prove that value if it makes a difference here.
> > I can't think why else you'd see such a performance difference, but
> > testing to see if the KVM exit rate is significantly different could
> > still be an interesting verification.
>
> It might be easiest to just dust off that patch and see whether it
> helps.
>
> --
> MST
>

Attachment: NVIDIAGPUDirectwithPCIPass-ThroughVirtualization.pdf
Description: Adobe PDF document

