qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] Re: [PATCH 3/3] msi: Store the capability size in PCIDevice


From: Michael S. Tsirkin
Subject: [Qemu-devel] Re: [PATCH 3/3] msi: Store the capability size in PCIDevice
Date: Tue, 2 Nov 2010 21:26:08 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Tue, Nov 02, 2010 at 10:08:30AM -0600, Alex Williamson wrote:
> On Tue, 2010-11-02 at 17:39 +0200, Michael S. Tsirkin wrote:
> > On Tue, Nov 02, 2010 at 08:23:10AM -0600, Alex Williamson wrote:
> > > On Tue, 2010-11-02 at 16:07 +0200, Michael S. Tsirkin wrote:
> > > > On Tue, Nov 02, 2010 at 08:00:38AM -0600, Alex Williamson wrote:
> > > > > On Tue, 2010-11-02 at 11:25 +0200, Michael S. Tsirkin wrote:
> > > > > > On Mon, Nov 01, 2010 at 11:37:53PM -0600, Alex Williamson wrote:
> > > > > > > Avoid needing to get the MSI capability flags every time we need 
> > > > > > > to
> > > > > > > check the capability length.  This also makes it accessible 
> > > > > > > outside
> > > > > > > of msi.c, making it easier for users to filter config space writes
> > > > > > > using msi_cap and msi_cap_size.
> > > > > > 
> > > > > > I think for this last use-case, we are better off with returning a
> > > > > > boolean from msi_write_config which tells us whether the write is in
> > > > > > range. This has the advantage in that it will also work well for 
> > > > > > other
> > > > > > capabilities. Or second best, if that is insufficient for some 
> > > > > > reason,
> > > > > > export an msi_cap_size function.
> > > > > 
> > > > > Returning whether the write was in range isn't enough.  For device
> > > > > assignment, I need to know whether the capability was enabled or
> > > > > disabled.  This currently means checking the enable state before and
> > > > > after calling msi_write_config and doing the appropriate backend 
> > > > > setup.
> > > > 
> > > > Sounds good. Why does this mean you need the capability size?
> > > >         bool was_enabled = msi_enabled(dev);
> > > >         msi_write_config(..)
> > > >         if (was_enabled != msi_enabled(dev)) {
> > > >         }
> > > 
> > > Because this code makes me sad...
> > > 
> > >    bool msi_was_enabled, msix_was_enabled, msi_is_enabled, 
> > > msix_is_enabled;
> > > 
> > >    msi_was_enabled = msi_enabled(dev);
> > >    msix_was_enabled = msix_enabled(dev);
> > > 
> > >    pci_default_write_config(...
> > >    msi_write_config(...
> > >    msix_write_config(...
> > > 
> > >    msi_is_enabled = msi_enabled(dev);
> > >    msix_is_enabled = msix_enabled(dev);
> > > 
> > >    if (msi_was_enabled && !msi_is_enabled)
> > >        disable_msi(...
> > >    if (!msi_was_enabled && msi_is_enabled)
> > >        enable_msi(...
> > >    if (msix_was_enabled && !msix_is_enabled)
> > >        disable_msi(...
> > >    if (!msix_was_enabled && msix_is_enabled)
> > >        enable_msix(...
> > > 
> > > Confining msi tests to an msi related write and msix tests to an msix
> > > related write makes me slightly happier.  I really think we need
> > > callbacks though so common msi/msix code can figure out if we've made a
> > > transition.
> > > 
> > > Alex
> > 
> > This is what we have in qemu-kvm for vhost now, and the code turned out
> > to be terribly hard to get right.  I would rather not repeat that,
> > and I would love to rip out the callbacks we have now, too.
> > One approach would be to simply fold the handling of irqfds
> > into msix.c.
> 
> What makes it hard to get right?  On one hand, if it is hard to get
> right, that's all the more reason it should be done in common code so we
> don't have to repeat mistakes.

Callbacks are hard to use right.

> > Having said all that, I really don't understand why does VFIO
> > force you to figure out that e.g. msix was enabled/disabled.
> > Can we not get the config write and simply call write() on VFIO?
> > That is an interface that makes sense to me.
> 
> VFIO interrupts are configured via ioctls.  Config space writes to
> msi/msix capabilities are emulated.  IMHO, this works out pretty well,
> but we could easily make use of the QEMU config emulation if VFIO just
> wanted to drop accesses there.  Config space could be used for some
> setup, but we have to setup INTx via ioctl and we'd have to pre-register
> eventfds per vector.  It's just easy and consistent to set them all up
> the same way.

Yea. So I think we should just do whatever is needed on startup: create
eventfds etc. And then during operation, qemu should
simply get called on config/memory/io writes and pass these
on to VFIO. If memory write masks a vector, or switches
from INTx to MSI or whatever, VFIO should be able to
figure it out.

BAR config register writes might be the only exception we might
need in qemu until we have explicit hooks into pci.c.

> > > > > I think the only way I could blindly call the msi/x write config
> > > > > routines is if we init the capability with enable/disable callbacks.
> > > > > I'd be ok with an msi_cap_size function if we don't want to go that 
> > > > > far
> > > > > too.  What do you prefer?  Thanks,
> > > 
> > > 
> 
> 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]