qemu-devel

Re: [Qemu-devel] VFIO based vGPU(was Re: [Announcement] 2015-Q3 release


From: Yang Zhang
Subject: Re: [Qemu-devel] VFIO based vGPU(was Re: [Announcement] 2015-Q3 release of XenGT - a Mediated ...)
Date: Wed, 27 Jan 2016 09:52:06 +0800
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1

On 2016/1/27 6:56, Alex Williamson wrote:
On Tue, 2016-01-26 at 22:39 +0000, Tian, Kevin wrote:
From: Alex Williamson [mailto:address@hidden
Sent: Wednesday, January 27, 2016 6:27 AM

On Tue, 2016-01-26 at 22:15 +0000, Tian, Kevin wrote:
From: Alex Williamson [mailto:address@hidden
Sent: Wednesday, January 27, 2016 6:08 AM



Today KVMGT (not using VFIO yet) registers I/O emulation callbacks with
KVM, so VM MMIO accesses are forwarded to KVMGT directly for
emulation in the kernel. If we reuse the above R/W flags, the whole emulation
path would be unnecessarily long, with an obvious performance impact. We
either need a new flag here to indicate in-kernel emulation (a departure
from passthrough support), or else simply hide the region (letting KVMGT
handle I/O emulation itself, as it does today).

That sounds like a future optimization TBH.  There's very strict
layering between vfio and kvm.  Physical device assignment could make
use of it as well, avoiding a round trip through userspace when an
ioread/write would do.  Userspace also needs to orchestrate those kinds
of accelerators; there might be cases where userspace wants to see those
transactions for debugging or manipulating the device.  We can't simply
take shortcuts to provide such direct access.  Thanks,


But we have to balance such debugging flexibility against acceptable
performance. To me the latter is more important; otherwise there'd be no
real usage of this technique, while for debugging there are other
alternatives (e.g. ftrace). Consider an extreme case with 100k traps/second
and then see how much impact a 2-3x longer emulation path can have...
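
To put rough numbers on the 100k traps/second case, here is a back-of-the-envelope sketch. The 2 µs baseline cost per trap is purely an assumption for illustration (no measurement appears in the thread), and `cpu_fraction` is a hypothetical helper.

```python
# Back-of-the-envelope CPU cost of MMIO traps.
# ASSUMPTION: the per-trap cost below is hypothetical, not a measured value.
TRAPS_PER_SECOND = 100_000   # the "extreme case" mentioned in the thread
BASE_COST_NS = 2_000         # assumed cost of one in-kernel emulated trap


def cpu_fraction(traps_per_second: int, cost_ns: float) -> float:
    """Return the fraction of one CPU core spent handling traps."""
    return traps_per_second * cost_ns / 1e9


if __name__ == "__main__":
    # Compare the baseline path with paths that are 2x and 3x longer.
    for factor in (1, 2, 3):
        frac = cpu_fraction(TRAPS_PER_SECOND, BASE_COST_NS * factor)
        print(f"{factor}x path: {frac:.0%} of one core")
```

Under that assumed baseline, the longer path moves the overhead from roughly a fifth of a core to between two fifths and three fifths; the real figure depends entirely on the actual per-trap cost.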

Are you jumping to the conclusion that it cannot be done with proper
layering in place?  Performance is important, but it's not an excuse to
abandon designing interfaces between independent components.  Thanks,


The two are not in conflict. My point is to remove unnecessarily long trips
wherever possible. On second thought, yes, we can reuse the existing
read/write flags:
        - KVMGT will expose a private control variable indicating whether
in-kernel delivery is required;

But in-kernel delivery is never *required*.  Wouldn't userspace want to
deliver in-kernel any time it possibly could?

        - when the variable is true, KVMGT will register in-kernel MMIO
emulation callbacks, and VM MMIO requests will then be delivered to KVMGT
directly;
        - when the variable is false, KVMGT will not register anything.
VM MMIO requests will then be delivered to QEMU, and ioread/write
will be used to finally reach the KVMGT emulation logic;
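
The two delivery paths described above can be modelled with a toy dispatcher. This is only a sketch of the control flow: `MmioBus`, `register_in_kernel`, and `demo_dispatch` are hypothetical names, not KVM or KVMGT interfaces.

```python
from typing import Callable, Dict, List, Tuple

Handler = Callable[[int, int], None]  # (offset within region, value)


class MmioBus:
    """Toy model of MMIO write dispatch; not KVM's real data structures."""

    def __init__(self) -> None:
        # base address -> (region size, in-kernel handler)
        self.in_kernel: Dict[int, Tuple[int, Handler]] = {}
        # writes that had to take the long trip through userspace
        self.userspace_log: List[Tuple[int, int]] = []

    def register_in_kernel(self, base: int, size: int, handler: Handler) -> None:
        self.in_kernel[base] = (size, handler)

    def write(self, addr: int, value: int) -> str:
        # Short path: a registered in-kernel handler claims the access.
        for base, (size, handler) in self.in_kernel.items():
            if base <= addr < base + size:
                handler(addr - base, value)
                return "in-kernel"
        # Long path: nothing registered, so the access exits to userspace.
        self.userspace_log.append((addr, value))
        return "userspace"


def demo_dispatch() -> Tuple[str, str]:
    seen: List[Tuple[int, int]] = []
    bus = MmioBus()
    before = bus.write(0x1000, 1)  # control variable "false": no callback yet
    bus.register_in_kernel(0x1000, 0x100, lambda off, val: seen.append((off, val)))
    after = bus.write(0x1004, 2)   # control variable "true": handled directly
    return before, after
```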

No, that means the interface is entirely dependent on a backdoor through
KVM.  Why can't userspace (QEMU) do something like register an MMIO
region with KVM, handled via a provided file descriptor and offset?
Couldn't KVM then call the file ops without a kernel exit?  Thanks,


Could you elaborate on this thought? If it can achieve the purpose without
a kernel exit, we can definitely adapt to it. :-)

I only thought of it when replying to the last email and have been doing
some research, but we already do quite a bit of synchronization through
file descriptors.  The kvm-vfio pseudo device uses a group file
descriptor to ensure a user has access to a group, allowing some degree
of interaction between modules.  Eventfds and irqfds already make use of
f_ops on file descriptors to poke data.  So, if KVM had information that
an MMIO region was backed by a file descriptor for which it already has
a reference via fdget() (and verified access rights and whatnot), then
it ought to be a simple matter to get to f_ops->read/write knowing the
base offset of that MMIO region.  Perhaps it could even simply use
__vfs_read/write().  Then we've got a proper reference to the file
descriptor for ownership purposes and we've transparently jumped across
modules without any implicit knowledge of the other end.  Could it work?
Thanks,
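
A userspace analogy of the idea above, as a minimal sketch: an MMIO region is described only by a file descriptor plus a base offset, and each guest access becomes a positioned read or write on that descriptor. A temporary file stands in for the device file descriptor, and `FdRegion` and `demo_fd_region` are hypothetical names; nothing here is an existing KVM or VFIO interface.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass
class FdRegion:
    """Hypothetical description of an MMIO region backed by a file descriptor."""
    gpa_base: int   # guest-physical base address of the region
    size: int       # region length in bytes
    fd: int         # descriptor the dispatcher already holds a reference to
    fd_offset: int  # offset of the region within that descriptor

    def _to_fd_offset(self, gpa: int, length: int) -> int:
        if not (self.gpa_base <= gpa and gpa + length <= self.gpa_base + self.size):
            raise ValueError("access outside region")
        return self.fd_offset + (gpa - self.gpa_base)

    def read(self, gpa: int, length: int) -> bytes:
        # Analogous to reaching the backing file's read op at a known offset.
        return os.pread(self.fd, length, self._to_fd_offset(gpa, length))

    def write(self, gpa: int, data: bytes) -> int:
        return os.pwrite(self.fd, data, self._to_fd_offset(gpa, len(data)))


def demo_fd_region() -> int:
    # The temporary file is only a stand-in for a device file descriptor.
    with tempfile.TemporaryFile() as backing:
        backing.truncate(0x2000)
        region = FdRegion(gpa_base=0xF0000000, size=0x1000,
                          fd=backing.fileno(), fd_offset=0x1000)
        region.write(0xF0000010, (0xDEADBEEF).to_bytes(4, "little"))
        return int.from_bytes(region.read(0xF0000010, 4), "little")
```

The dispatcher needs no knowledge of what sits behind the descriptor, which is the layering property argued for above.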

ioeventfd is a good example.
As far as I know, all accesses to the IGD's MMIO are trapped into the
kernel. Also, the PCI config space is emulated by QEMU. The same goes for
VGA, which is emulated too. I guess interrupts are also emulated (which
means we cannot benefit from VT-d PI). Most importantly, KVMGT doesn't
require a hardware IOMMU. As we know, VFIO is meant for direct device
assignment, but most things in KVMGT are emulated, so why should we use
VFIO for it?

--
best regards
yang


