Re: [Qemu-devel] VFIO and scheduled SR-IOV cards


From: Alex Williamson
Subject: Re: [Qemu-devel] VFIO and scheduled SR-IOV cards
Date: Mon, 03 Jun 2013 12:02:09 -0600

On Mon, 2013-06-03 at 18:33 +0200, Benoît Canet wrote:
> Hello,
> 
> I plan to write a PF driver for an SR-IOV card and make the VFs work with
> QEMU's VFIO passthrough, so I am asking the following design question before
> trying to write and push code.
> 
> After SR-IOV is enabled on this hardware, only one VF function can be active
> at a given time.

Is this actually an SR-IOV device or are you trying to write a driver
that emulates SR-IOV for a PF?

> The PF host kernel driver acts as a scheduler.
> It switches every few milliseconds which VF is the currently active function
> while disabling the other VFs.
> 
> One consequence of how the hardware works is that the MMR regions of the
> switched-off VFs must be unmapped, and their I/O accesses should block until
> the VF is switched on again.

MMR = Memory Mapped Register?

This seems contradictory to the SR-IOV spec, which states:

        Each VF contains a non-shared set of physical resources required to
        deliver Function-specific services, e.g., resources such as work
        queues, data buffers, etc. These resources can be directly accessed
        by an SI without requiring VI or SR-PCIM intervention.

Furthermore, each VF should have a separate requester ID.  What's being
described here suggests that may not be the case.  If true, it would
make iommu groups challenging.  Is there any VF save/restore around the
scheduling?

> Each IOMMU map/unmap should be done in less than 100ns.

I think that may be a lot to ask if we need to unmap the regions in the
guest and in the iommu.  If the "VFs" used different requester IDs,
iommu unmapping wouldn't be necessary.  I experimented with switching
between trapped (read/write) access to memory regions and mmap'd (direct
mapping) for handling legacy interrupts.  There was a noticeable
performance penalty switching per interrupt.
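
Roughly, that switch looks something like the sketch below on the QEMU side.
memory_region_set_enabled() is the real API; the VFIOBAR field name is from
memory and may not match hw/misc/vfio.c exactly:

/* Sketch: toggle a BAR between mmap'd (fast) and trapped (slow) access.
 * The mmap'd subregion overlays the trapped region, so disabling it
 * forces every guest access to exit to QEMU. */
static void vfio_bar_set_mmap_enabled(VFIOBAR *bar, bool enabled)
{
    memory_region_set_enabled(&bar->mmap_mem, enabled);
}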

> As the kernel iommu module is being called by the VFIO driver the PF driver
> cannot interface with it.
> 
> Currently the only interface of the VFIO code is for the userland QEMU
> process, and I fear that notifying QEMU that it should do the unmap/block
> would take more than 100ns.
> 
> Also blocking the IO access in QEMU under the BQL would freeze QEMU.
> 
> Do you have an idea of how to write this required map and block/unmap
> feature?

It seems like there are several options, but I'm doubtful that any of
them will meet 100ns.  If this is completely fake SR-IOV and there's not
a different requester ID per VF, I'd start with seeing if you can even
do the iommu_unmap/iommu_map of the MMIO BARs in under 100ns.  If that's
close to your limit, then your only real option for QEMU is to freeze
it, which still involves getting multiple (maybe many) vCPUs out of VM
mode.  That's not free either.  If by some miracle you have time to
spare, you could remap the regions to trapped mode and let the vCPUs run
while vfio blocks on read/write.
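
If it helps, a rough way to measure that from the PF driver would be the
untested sketch below; BAR_IOVA/BAR_PHYS/BAR_SIZE are placeholders for your
VF's MMIO BAR, and a real driver would check the iommu_map() return value:

#include <linux/iommu.h>
#include <linux/ktime.h>

static void time_bar_remap(struct iommu_domain *domain)
{
	ktime_t start = ktime_get();

	/* Tear down and re-establish the IOVA mapping of one MMIO BAR */
	iommu_unmap(domain, BAR_IOVA, BAR_SIZE);
	iommu_map(domain, BAR_IOVA, BAR_PHYS, BAR_SIZE,
		  IOMMU_READ | IOMMU_WRITE);

	pr_info("unmap+map took %lld ns\n",
		ktime_to_ns(ktime_sub(ktime_get(), start)));
}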

Maybe there's even a question whether mmap'd mode is worthwhile for this
device.  Trapping every read/write is orders of magnitude slower, but
allows you to handle the "wait for VF" on the kernel side.
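
Something along these lines (purely hypothetical names -- my_vf, vf->wait,
vf->active and do_mmio_read() don't exist anywhere, they just illustrate the
idea) could block a trapped access until your scheduler re-enables the VF:

static ssize_t my_vf_region_read(struct my_vf *vf, char __user *buf,
				 size_t count, loff_t *ppos)
{
	/* Sleep until the PF scheduler marks this VF active again,
	 * woken via wake_up() from the scheduling tick. */
	if (wait_event_interruptible(vf->wait, vf->active))
		return -ERESTARTSYS;

	return do_mmio_read(vf, buf, count, ppos);
}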

If you can provide more info on the device design/constraints, maybe we
can come up with better options.  Thanks,

Alex



