
Re: [Qemu-devel] Guest IOMMU and Cisco usnic


From: Benoît Canet
Subject: Re: [Qemu-devel] Guest IOMMU and Cisco usnic
Date: Wed, 12 Feb 2014 23:38:35 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On Wednesday, 12 Feb 2014 at 12:34:25 (-0700), Alex Williamson wrote:
> On Wed, 2014-02-12 at 19:10 +0100, Benoît Canet wrote:
> > Hi Alex,
> > 
> > After the IRC conversation we had a few days ago, I understood that a
> > guest IOMMU was not implemented.
> > 
> > I have a real use case for it:
> > 
> > Cisco usnic allows writing MPI applications that drive the network
> > card in userspace in order to optimize latency. It's made for compute
> > clusters.
> > 
> > The typical cloud provider doesn't provide bare-metal access but only
> > VMs on top of Cisco's hardware, hence VFIO is using the host IOMMU to
> > pass the NIC through to the guest, and no IOMMU is present in the
> > guest.
> > 
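For reference, the host-side sequence a VFIO user drives to pass a device
through looks roughly like the example in the kernel's
Documentation/vfio.txt; the group number "26" and device address
"0000:06:0d.0" below are placeholders:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <linux/vfio.h>

    int main(void)
    {
        int container, group, device;
        struct vfio_group_status status = { .argsz = sizeof(status) };

        /* One container == one host IOMMU domain (with the type1 backend). */
        container = open("/dev/vfio/vfio", O_RDWR);
        if (container < 0 ||
            ioctl(container, VFIO_GET_API_VERSION) != VFIO_API_VERSION)
            return 1;

        /* The device is only reachable through its IOMMU group. */
        group = open("/dev/vfio/26", O_RDWR);
        if (group < 0)
            return 1;

        ioctl(group, VFIO_GROUP_GET_STATUS, &status);
        if (!(status.flags & VFIO_GROUP_FLAGS_VIABLE))
            return 1;    /* some group member is not bound to vfio-pci */

        ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);
        ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU);

        device = ioctl(group, VFIO_GROUP_GET_DEVICE_FD, "0000:06:0d.0");
        printf("device fd: %d\n", device);
        return device < 0;
    }
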
> > Questions: Would writing a performant guest IOMMU implementation be
> >            possible?
> >            How complex does this project look to someone who knows
> >            IOMMU issues?
> > 
> > The ideal implementation would forward the IOMMU work to the host
> > hardware for speed.
> > 
> > I can devote time to writing the feature if it's doable.
> 
> Hi Benoît,
> 
> I imagine it's doable, but it's certainly not trivial; beyond that I
> haven't put much thought into it.

Thanks for the answer.
It worries me when an expert in the field says "not trivial" :)

Best regards

Benoît

> 
> VFIO running in a guest would need an IOMMU that implements both the
> IOMMU API and IOMMU groups.  Whether that comes from an emulated
> physical IOMMU (like VT-d) or from a new paravirt IOMMU would be for you
> to decide.  VT-d would imply using a PCIe chipset like Q35 and trying to
> bandage on VT-d or updating Q35 to something that natively supports
> VT-d.  Getting a sufficiently similar PCIe hierarchy between host and
> guest would also be required.
> 
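As a sketch of what "implements both the IOMMU API and IOMMU groups" means
in practice, this is roughly the consumer side of the Linux IOMMU API that
VFIO relies on (signatures approximate for ~3.x kernels; the device
pointer is assumed to come from the caller):

    #include <linux/errno.h>
    #include <linux/iommu.h>
    #include <linux/pci.h>

    static int attach_and_map(struct device *dev, unsigned long iova,
                              phys_addr_t paddr, size_t size)
    {
        struct iommu_domain *domain;
        struct iommu_group *group;
        int ret;

        /* Needs IOMMU API support from the (guest) IOMMU driver. */
        domain = iommu_domain_alloc(&pci_bus_type);
        if (!domain)
            return -ENODEV;

        /* Needs IOMMU group support: devices come in isolation groups. */
        group = iommu_group_get(dev);
        if (!group) {
            iommu_domain_free(domain);
            return -ENODEV;
        }

        ret = iommu_attach_group(domain, group);
        if (!ret)
            ret = iommu_map(domain, iova, paddr, size,
                            IOMMU_READ | IOMMU_WRITE);

        /* A real consumer keeps the domain around to unmap/detach later. */
        iommu_group_put(group);
        return ret;
    }

An emulated VT-d or a paravirt IOMMU in the guest would have to back these
calls with real isolation and real mappings.
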
> The current model of putting all guest devices in a single IOMMU domain
> on the host is likely not what you would want and might imply a new VFIO
> IOMMU backend that is better tuned for separate domains, sparse
> mappings, and low-latency.  VFIO has a modular IOMMU design, so this
> isn't architecturally a problem.  The VFIO user (QEMU) is able to select
> which backend to use and the code is written with supporting multiple
> backends in mind.
> 
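A minimal sketch of that backend selection, using the extension IDs
already in <linux/vfio.h> (a new low-latency backend would just be one
more ID to probe; error handling omitted):

    #include <sys/ioctl.h>
    #include <linux/vfio.h>

    /* Must be called after at least one group has been attached to the
     * container with VFIO_GROUP_SET_CONTAINER. */
    static int select_iommu_backend(int container)
    {
        const int candidates[] = { VFIO_TYPE1_IOMMU, VFIO_SPAPR_TCE_IOMMU };
        unsigned int i;

        for (i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++)
            if (ioctl(container, VFIO_CHECK_EXTENSION, candidates[i]) > 0)
                return ioctl(container, VFIO_SET_IOMMU, candidates[i]);

        return -1;    /* no supported backend */
    }
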
> A complication you'll have is that the granularity of IOMMU operations
> through VFIO is at the IOMMU group level, so the guest would not be able
> to easily split devices grouped together on the host between separate
> users in the guest.  That could be modeled as a conventional PCI bridge
> masking the requester ID of devices in the guest such that host groups
> are mirrored as guest groups.
> 
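That group-level granularity is visible in the mapping interface itself:
DMA mappings are made against the container (the set of groups sharing a
domain), not against an individual device fd. A sketch, assuming a
container set up as in the earlier example:

    #include <stdint.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <linux/vfio.h>

    static int map_guest_buffer(int container, void *vaddr,
                                uint64_t iova, uint64_t size)
    {
        struct vfio_iommu_type1_dma_map map;

        memset(&map, 0, sizeof(map));
        map.argsz = sizeof(map);
        map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
        map.vaddr = (uintptr_t)vaddr;   /* process virtual address */
        map.iova  = iova;               /* address the device will use */
        map.size  = size;

        /* Affects every device in every group attached to the container. */
        return ioctl(container, VFIO_IOMMU_MAP_DMA, &map);
    }

Anything finer-grained inside the guest has to respect that boundary,
hence the idea of mirroring host groups as guest groups.
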
> There might also be simpler "punch-through" ways to do it: for
> instance, instead of trying to make it work like it does on the host,
> we could invent a paravirt VFIO interface where a vfio-pv driver in the
> guest populates /dev/vfio with slightly modified passthroughs to the
> host fds.  The guest OS may not even really need to be aware of the
> device.
> 
> It's an interesting project and certainly a valid use case.  I'd also
> like to see things like Intel's DPDK move to using VFIO, but the current
> UIO DPDK is often used in guests.  Thanks,
> 
> Alex