qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 0/5] PCI: Implement basic PCI PM capability backing


From: Michael S. Tsirkin
Subject: Re: [PATCH v2 0/5] PCI: Implement basic PCI PM capability backing
Date: Tue, 4 Mar 2025 12:14:02 -0500

On Tue, Mar 04, 2025 at 01:18:45PM +0100, Cédric Le Goater wrote:
> Hello Michael,
> 
> Could you please re-ack (or not) v2 ?
> 
> Thanks
> 
> C.


pci things:
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>


> On 2/25/25 22:52, Alex Williamson wrote:
> > v2:
> > 
> > Eric noted in v1 that one of the drivers had a redundant wmask setting
> > since pci_pm_init() enabled writes to the power state field.  This was
> > added because vfio-pci was not setting wmask for this capability but
> > is allowing writes to the PM state field through to the device.  For
> > vfio-pci, QEMU emulated config space is rather secondary to the config
> > space through vfio.
> > 
> > It turns out therefore, that vfio-pci is nearly unique in not already
> > managing the wmask of the PM capability state and if we embrace that
> > it's the pci_pm_init() caller's responsibility to manage the remaining
> > contents and write-access of the capability, then I think we also
> > solve the question of migration compatibility.  The new infrastructure
> > here is not changing whether any fields were previously writable, it's
> > only effecting a mapping change based on the value found there.
> > 
> > This requires only a slight change to patch 1/, removing setting of
> > the wmask, but commit log is also updated and comments added.  I also
> > made the bad transition trace a little more obvious given Eric's
> > comments.  Patch 2/ is also updated so that vfio-pci effects the wmask
> > change locally.  The couple drivers that don't currently update wmask
> > simply don't get this new BAR unmapped when not in D0 behavior.
> > 
> > Incorporated reviews for the unmodified patches.  Please re-review and
> > report any noted issues.  Thanks,
> > 
> > Alex
> > 
> > v1:
> > 
> > https://lore.kernel.org/all/20250220224918.2520417-1-alex.williamson@redhat.com/
> > 
> > Eric recently identified an issue[1] where during graceful shutdown
> > of a VM in a vIOMMU configuration, the guest driver places the device
> > into the D3 power state, the vIOMMU is then disabled, triggering an
> > AddressSpace update.  The device BARs are still mapped into the AS,
> > but the vfio host driver refuses to DMA map the MMIO space due to the
> > device power state.
> > 
> > The proposed solution in [1] was to skip mappings based on the
> > device power state.  Here we take a different approach.  The PCI spec
> > defines that devices in D1/2/3 power state should respond only to
> > configuration and message requests and all other requests should be
> > handled as an Unsupported Request.  In other words, the memory and
> > IO BARs are not accessible except when the device is in the D0 power
> > state.
> > 
> > To emulate this behavior, we can factor the device power state into
> > the mapping state of the device BARs.  Therefore the BAR is marked
> > as unmapped if either the respective command register enable bit is
> > clear or the device is not in the D0 power state.
> > 
> > In order to implement this, the PowerState field of the PMCSR
> > register becomes writable, which allows the device to appear in
> > lower power states.  This also therefore implements D3 support
> > (insofar as the BAR behavior) for all devices implementing the PM
> > capability.  The PCI spec requires D3 support.
> > 
> > An aspect that needs attention here is whether this change in the
> > wmask and PMCSR bits becomes a problem for migration, and how we
> > might solve it.  For a guest migrating old->new, the device would
> > always be in the D0 power state, but the register becomes writable.
> > In the opposite direction, is it possible that a device could
> > migrate in a low power state and be stuck there since the bits are
> > read-only in old QEMU?  Do we need an option for this behavior and a
> > machine state bump, or are there alternatives?
> > 
> > Thanks,
> > Alex
> > 
> > [1]https://lore.kernel.org/all/20250219175941.135390-1-eric.auger@redhat.com/
> > 
> > 
> > Alex Williamson (5):
> >    hw/pci: Basic support for PCI power management
> >    pci: Use PCI PM capability initializer
> >    vfio/pci: Delete local pm_cap
> >    pcie, virtio: Remove redundant pm_cap
> >    hw/vfio/pci: Re-order pre-reset
> > 
> >   hw/net/e1000e.c                 |  3 +-
> >   hw/net/eepro100.c               |  4 +-
> >   hw/net/igb.c                    |  3 +-
> >   hw/nvme/ctrl.c                  |  3 +-
> >   hw/pci-bridge/pcie_pci_bridge.c |  3 +-
> >   hw/pci/pci.c                    | 93 ++++++++++++++++++++++++++++++++-
> >   hw/pci/trace-events             |  2 +
> >   hw/vfio/pci.c                   | 34 ++++++------
> >   hw/vfio/pci.h                   |  1 -
> >   hw/virtio/virtio-pci.c          | 11 ++--
> >   include/hw/pci/pci.h            |  3 ++
> >   include/hw/pci/pci_device.h     |  3 ++
> >   include/hw/pci/pcie.h           |  2 -
> >   13 files changed, 127 insertions(+), 38 deletions(-)
> > 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]