Re: [Qemu-devel] Questions about VFIO enabling MSIX vector


From: Alex Williamson
Subject: Re: [Qemu-devel] Questions about VFIO enabling MSIX vector
Date: Tue, 15 Jan 2019 09:18:17 -0700

On Tue, 15 Jan 2019 11:21:51 +0800
Heyi Guo <address@hidden> wrote:

> Hi Alex,
> 
> Really appreciate your comments. I have some more questions below.
> 
> 
> On 2019/1/15 0:07, Alex Williamson wrote:
> > On Sat, 12 Jan 2019 10:30:40 +0800
> > Heyi Guo <address@hidden> wrote:
> >  
> >> Hi folks,
> >>
> >> I have some questions about vfio_msix_vector_do_use() in
> >> hw/vfio/pci.c; could you help explain?
> >>
> >> When the guest tries to enable one specific MSI-X vector by
> >> unmasking its MSI-X Vector Control, the access is trapped and ends
> >> up in vfio_msix_vector_do_use(), where we may take the branch at
> >> line 525:
> >>
> >>
> >> 520 /*
> >> 521  * We don't want to have the host allocate all possible MSI vectors
> >> 522  * for a device if they're not in use, so we shutdown and incrementally
> >> 523  * increase them as needed.
> >> 524  */
> >> 525  if (vdev->nr_vectors < nr + 1) {
> >> 526      vfio_disable_irqindex(&vdev->vbasedev, VFIO_PCI_MSIX_IRQ_INDEX);
> >> 527      vdev->nr_vectors = nr + 1;
> >> 528      ret = vfio_enable_vectors(vdev, true);
> >> 529      if (ret) {
> >> 530          error_report("vfio: failed to enable vectors, %d", ret);
> >> 531      }
> >>
> >> Here all MSI-X vectors are disabled first and then re-enabled, with
> >> one more vector than before. The comment is there but I still don't
> >> quite understand it. It makes sense not to allocate all possible MSI
> >> vectors, but why must we shut down the whole MSI-X index when asked
> >> to enable one specific vector? Can't we just enable the caller's
> >> vector indexed by "nr"?
> >
> > What internal kernel API would vfio-pci make use of to achieve that?
> > We can't know the intentions of the guest and we have a limited set of
> > tools to manage the MSI-X state in the host kernel.  It's generally the
> > case that MSI-X vectors are only manipulated during driver setup and
> > shutdown, so while the current behavior isn't necessarily efficient, it
> > works within the constraints and generally doesn't have a runtime impact
> > on performance.  We also recognize that this behavior may change in the
> > future depending on more dynamic internal host kernel interfaces, thus
> > we export the VFIO_IRQ_INFO_NORESIZE flag to indicate to userspace
> > whether this procedure is required.  
> 
> I tried to enable only the specified MSI-X vector, and finally
> understood the original QEMU code after repeatedly getting EINVAL
> from the VFIO_DEVICE_SET_IRQS ioctl. Then I tried allocating all
> MSI-X vectors in vfio_msix_enable(), not only MSI-X vector 0. The
> change works, but it clearly goes against the comment in the code:
> "We don't want to have the host allocate all possible MSI vectors
> for a device if they're not in use". May I ask what the side effect
> is if we really do allocate all possible vectors at the beginning?

Host IRQ vector exhaustion.  Very few devices seem to make use of their
full complement of available vectors, and some devices support a very,
very large number of vectors.  The current approach only consumes the
host resources that the guest is actually making use of.
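
For what it's worth, the reason the whole index gets torn down is
visible in the ioctl interface itself: the MSI-X trigger configuration
is replaced as a whole, one eventfd per vector, in a single
VFIO_DEVICE_SET_IRQS call.  A minimal userspace sketch of that step
(the helper name and the eventfd bookkeeping are made up for
illustration; the ioctl and structure layout come from <linux/vfio.h>):

#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/*
 * Illustrative helper: replace the MSI-X trigger configuration with
 * "nr_vectors" eventfds in one call.  With VFIO_IRQ_INFO_NORESIZE set
 * for the index, growing the set means disabling the index first and
 * then programming all vectors again like this.
 */
static int set_msix_triggers(int device_fd, const int32_t *eventfds,
                             int nr_vectors)
{
    size_t argsz = sizeof(struct vfio_irq_set) +
                   nr_vectors * sizeof(int32_t);
    struct vfio_irq_set *irq_set = calloc(1, argsz);
    int ret;

    if (!irq_set) {
        return -ENOMEM;
    }

    irq_set->argsz = argsz;
    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
                     VFIO_IRQ_SET_ACTION_TRIGGER;
    irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
    irq_set->start = 0;
    irq_set->count = nr_vectors;
    /* One eventfd per vector; an fd of -1 leaves a vector unconfigured. */
    memcpy(irq_set->data, eventfds, nr_vectors * sizeof(int32_t));

    ret = ioctl(device_fd, VFIO_DEVICE_SET_IRQS, irq_set);
    free(irq_set);
    return ret;
}

Growing from N to N+1 vectors then amounts to disabling the index
(VFIO_IRQ_SET_DATA_NONE with count 0) and calling the helper again
with the larger array, which is exactly the sequence in the QEMU
excerpt above.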

> >> What's more, on ARM64 systems with a GIC ITS, the kernel issues an
> >> ITS DISCARD command when disabling an MSI vector, which drops any
> >> currently pending MSI interrupt. If the device driver in the guest
> >> enables some MSIs first (so interrupts may arrive at any time) and
> >> then tries to enable other MSIs, is it possible for the above code
> >> to cause missed interrupts?
> >
> > Interrupt reconfiguration generally happens during driver setup or
> > teardown, when the device is considered idle or lost interrupts are
> > not a concern.  If you have a device which manipulates its interrupt
> > configuration at runtime, you can expect that lost interrupts won't
> > be the only problem,
>
> On a physical machine, if the driver enables vector 0 first and then
> vector 1 later, is there any chance that a vector 0 interrupt gets
> lost? In my opinion the manipulation of vector 1 should not affect
> the interrupts of vector 0, should it? But if this is possible in the
> virtual world, what do we consider it: a bug to be fixed in the
> future, or a known limitation of virtualization that won't be fixed
> and that all driver developers should pay attention to?

Are you experiencing an actual problem with lost interrupts, or is this
a theoretical question?  MSI-X programming differs for an assigned
device because we cannot know the guest's intentions with respect to
how many vectors it intends to use; we only learn that a vector is in
use when the guest unmasks it.  Allocating the maximum complement of
interrupts supported by the device is wasteful of host resources.
Given this, combined with the fact that
drivers generally configure vectors only at device setup time, the
existing code generally works well.  Ideally drivers should not be
required to modify their behavior to support device assignment, but
certain actions, like high-frequency masking/unmasking of interrupts
through the vector table, impose unavoidable virtualization overhead
and should be avoided.  I believe older Linux guests (RHEL5 time frame)
had such behavior.
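
To put that in concrete terms, the pattern that works well with
assignment is the one most drivers already follow: negotiate all of
their vectors once at probe time and leave the vector table alone while
the device is live.  A rough, hypothetical guest-driver fragment using
the Linux PCI/MSI helpers (the driver name, the handler and the choice
of 8 vectors are just examples):

#include <linux/pci.h>
#include <linux/interrupt.h>

/*
 * Hypothetical probe-time fragment: request the MSI-X vectors once, up
 * front, rather than reconfiguring interrupts while the device is live.
 */
static int example_setup_irqs(struct pci_dev *pdev, irq_handler_t handler,
                              void *ctx)
{
    int nvecs, i, err;

    /* Negotiate between 1 and 8 MSI-X vectors in a single call. */
    nvecs = pci_alloc_irq_vectors(pdev, 1, 8, PCI_IRQ_MSIX);
    if (nvecs < 0)
        return nvecs;

    for (i = 0; i < nvecs; i++) {
        err = request_irq(pci_irq_vector(pdev, i), handler, 0,
                          "example", ctx);
        if (err)
            goto unwind;
    }
    return 0;

unwind:
    while (--i >= 0)
        free_irq(pci_irq_vector(pdev, i), ctx);
    pci_free_irq_vectors(pdev);
    return err;
}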

It's possible that current IRQ domains alleviate some of the concern
for host IRQ exhaustion and that vfio-pci could be made to allocate the
maximum supported vectors per device, removing the use of the NORESIZE
flag for the index.  Userspace would need to continue to support both
modes of operation and the kernel would need to continue to be
compatible with that.
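
Today, userspace can already tell which mode it is dealing with by
checking that flag on the MSI-X index.  A minimal sketch (the helper
name is made up; the ioctl and structure are from <linux/vfio.h>):

#include <stdbool.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/*
 * Illustrative check: does the MSI-X index require the full
 * disable/re-enable sequence when the number of vectors grows?
 */
static bool msix_needs_full_reprogram(int device_fd)
{
    struct vfio_irq_info info = {
        .argsz = sizeof(info),
        .index = VFIO_PCI_MSIX_IRQ_INDEX,
    };

    if (ioctl(device_fd, VFIO_DEVICE_GET_IRQ_INFO, &info)) {
        return true;    /* be conservative if the query fails */
    }

    return info.flags & VFIO_IRQ_INFO_NORESIZE;
}

Thanks,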

Alex


