qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCHv2] piix: fix up/down races + document


From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [PATCHv2] piix: fix up/down races + document
Date: Fri, 30 Mar 2012 15:19:33 +0300
User-agent: Mutt/1.5.21 (2010-09-15)

On Thu, Mar 29, 2012 at 01:52:31PM -0600, Alex Williamson wrote:
> On Thu, 2012-03-29 at 21:38 +0200, Michael S. Tsirkin wrote:
> > On Thu, Mar 29, 2012 at 10:53:38AM -0600, Alex Williamson wrote:
> > > On Thu, 2012-03-29 at 14:51 +0200, Michael S. Tsirkin wrote:
> > > > piix acpi interface suffers from the following 2 issues:
> > > > 
> > > > 1.
> > > > - delete device a
> > > > - quickly add device b in another slot
> > > > 
> > > > if we do this before guest reads the down register,
> > > > the down event is discarded and device will never
> > > > be deleted.
> > > > 
> > > > 2.
> > > > - delete device a
> > > > - quickly reset before guest can respond
> > > > 
> > > > interrupt is reset and guest will never eject the device.
> > > > 
> > > > To fix this, we implement two changes:
> > > > 
> > > > 1. Add two new registers:
> > > > CLR_UP
> > > > CLR_DOWN
> > > > bios will now write to these the value it read from UP/DOWN
> > > > 
> > > > 2. on reset, remove all devices which have DOWN bit set
> > > > 
> > > > For compatibility with old guests, we also clear
> > > > the DOWN bit on write to EJ0 for a device.
> > > > 
> > > > This patch also adds more detailed documentation
> > > > for the ACPI interface, including the semantics
> > > > of both old and new registers.
> > > > 
> > > > Signed-off-by: Michael S. Tsirkin <address@hidden>
> > > > ---
> > > > 
> > > > Changes from v1:
> > > > - tweaked a comment about clearing down register on reset
> > > > - documentation update
> > > > 
> > > >  docs/specs/acpi_pci_hotplug.txt |   68 
> > > > +++++++++++++++++++++++++++++++----
> > > >  hw/acpi_piix4.c                 |   75 
> > > > +++++++++++++++++++++++++++++++-------
> > > >  2 files changed, 121 insertions(+), 22 deletions(-)
> > > > 
> > > > diff --git a/docs/specs/acpi_pci_hotplug.txt 
> > > > b/docs/specs/acpi_pci_hotplug.txt
> > > > index f0f74a7..a8398f9 100644
> > > > --- a/docs/specs/acpi_pci_hotplug.txt
> > > > +++ b/docs/specs/acpi_pci_hotplug.txt
> > > > @@ -7,31 +7,83 @@ describes the interface between QEMU and the ACPI 
> > > > BIOS.
> > > >  ACPI GPE block (IO ports 0xafe0-0xafe3, byte access):
> > > >  -----------------------------------------
> > > >  
> > > > -Generic ACPI GPE block. Bit 1 (GPE.1) used to notify PCI hotplug/eject
> > > > +Generic ACPI GPE block. Bit 1 (GPE.1) is used to notify PCI 
> > > > hotplug/hotunplug
> > > >  event to ACPI BIOS, via SCI interrupt.
> > > >  
> > > > -PCI slot injection notification pending (IO port 0xae00-0xae03, 4-byte 
> > > > access):
> > > > +Semantics: this event occurs each time a bit in either Pending PCI slot
> > > > +injection notification or Pending PCI slot removal notification 
> > > > registers
> > > > +changes status from 0 to 1.
> > > > +
> > > > +Pending PCI slot injection notification (IO port 0xae00-0xae03, 4-byte 
> > > > access):
> > > >  ---------------------------------------------------------------
> > > > -Slot injection notification pending. One bit per slot.
> > > > +Slot injection notification is pending. One bit per slot.
> > > > +Guests should not write this register, in practice the value
> > > > +written overwrites the register.
> > > 
> > > Sorry if I'm behind on the discussion, but why do we need to define
> > > 0xae0c & 0xae10 below for write-only access when these registers (0xae00
> > > & 0xae04) are effectively read-only (yes, write is there, but it should
> > > *never* be used in the current implementation)?  Thanks,
> > > 
> > > Alex
> > 
> > Because otherwise a new bios will be broken on an old qemu.
> > We should have made the registers read-only in the old qemu
> > then we won't have the problem now.
> 
> If you think there might be rouge bioses out there that ever wrote to
> these registers, but I doubt that's the case.

As I understand it, the issue is not rouge bioses,
it's the old qemus that make these writeable, and we want the new bios
to work there. Therefore new bios should avoid writing these
registers.

> > Yes, it's probably not a big deal, but it's just as easy
> > to create a new register as it is to redefine the old one.
> 
> It could just as easily be called a spec clarification that writes were
> never intended previously.

As I see it, the issue is with old implementations, not the spec.

>  It's easy to add two more registers... oh
> wait, we only have 16bits of ioport space...  Thanks,
> 
> Alex

8 bytes is still cheap ...

> > > >  Read by ACPI BIOS GPE.1 handler to notify OS of injection
> > > >  events.
> > > >  
> > > > -PCI slot removal notification (IO port 0xae04-0xae07, 4-byte access):
> > > > +Semantics: after a slot is populated by hotplug, it is immediately 
> > > > enabled
> > > > +and a bit in this register is set.
> > > > +Reset value: 0
> > > > +
> > > > +Pending PCI slot removal notification (IO port 0xae04-0xae07, 4-byte 
> > > > access):
> > > >  -----------------------------------------------------
> > > > -Slot removal notification pending. One bit per slot.
> > > > +Slot removal notification is pending. One bit per slot.
> > > > +Guests should not write this register, in practice the value
> > > > +written overwrites the register.
> > > >  
> > > >  Read by ACPI BIOS GPE.1 handler to notify OS of removal
> > > >  events.
> > > >  
> > > > +Semantics: upon hotunplug request, a bit in this register is set
> > > > +and the OSPM is notified about hotunplug request, but a slot remains 
> > > > enabled
> > > > +until eject.
> > > > +Reset value: 0
> > > > +
> > > >  PCI device eject (IO port 0xae08-0xae0b, 4-byte access):
> > > >  ----------------------------------------
> > > >  
> > > >  Used by ACPI BIOS _EJ0 method to request device removal. One bit per 
> > > > slot.
> > > > -Reads return 0.
> > > > +Guests should not read this register. In practice reads return 0.
> > > > +
> > > > +Semantics: selected hotunpluggable slots are disabled and powered off.
> > > > +Has no effect on non-hotunpluggable slots.
> > > > +
> > > > +For compatibility with old BIOS, writes to this register
> > > > +clear bits in pending slot injection notification register.
> > > >  
> > > >  PCI removability status (IO port 0xae0c-0xae0f, 4-byte access):
> > > >  -----------------------------------------------
> > > >  
> > > > -Used by ACPI BIOS _RMV method to indicate removability status to OS. 
> > > > One
> > > > -bit per slot.
> > > > +Used by ACPI BIOS _RMV method to detect removability status
> > > > +and report to OS. One bit per slot.
> > > > +Guests should not write this register, in practice the value
> > > > +written is discarded.
> > > > +
> > > > +Semantics: selected slots support hotplug. Contents of this register
> > > > +do not change while guest is running.
> > > > +
> > > > +Clear PCI slot injection notification (IO port 0xae10-0xae14, 4-byte 
> > > > access):
> > > > +---------------------------------------------------------------
> > > > +Clears bits in pending slot injection notification register. One bit 
> > > > per slot.
> > > > +Guests should not read this register. In practice reads return 0.
> > > > +
> > > > +Written to by ACPI BIOS GPE.1 handler after reading the
> > > > +pending slot injection notification register and before
> > > > +notifying OS of injection events.
> > > > +
> > > > +Semantics: used to acknowledge the injection notification.
> > > > +Does not affect slot status.
> > > > +
> > > > +Clear PCI slot removal notification (IO port 0xae04-0xae07, 4-byte 
> > > > access):
> > > > +-----------------------------------------------------
> > > > +Clears bits in slot removal notification pending register. One bit per 
> > > > slot.
> > > > +Guests should not read this register. In practice reads return 0.
> > > > +
> > > > +Written to by ACPI BIOS GPE.1 handler after reading the
> > > > +pending slot removal notification register and before
> > > > +notifying OS of removal events.
> > > > +
> > > > +Semantics: used to acknowledge the hotunplug request.
> > > > +Does not affect slot status.
> > > > diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
> > > > index 797ed24..50553fd 100644
> > > > --- a/hw/acpi_piix4.c
> > > > +++ b/hw/acpi_piix4.c
> > > > @@ -43,6 +43,8 @@
> > > >  #define PCI_BASE 0xae00
> > > >  #define PCI_EJ_BASE 0xae08
> > > >  #define PCI_RMV_BASE 0xae0c
> > > > +#define PCI_CLR_UP_BASE 0xae10
> > > > +#define PCI_CLR_DOWN_BASE 0xae14
> > > >  
> > > >  #define PIIX4_PCI_HOTPLUG_STATUS 2
> > > >  
> > > > @@ -287,6 +289,28 @@ static void piix4_update_hotplug(PIIX4PMState *s)
> > > >      }
> > > >  }
> > > >  
> > > > +static void acpi_piix_eject_slot(PIIX4PMState *s, unsigned slots)
> > > > +{
> > > > +    DeviceState *qdev, *next;
> > > > +    BusState *bus = qdev_get_parent_bus(&s->dev.qdev);
> > > > +    int slot = ffs(slots) - 1;
> > > > +
> > > > +    /*
> > > > +     * Clear DOWN register - this is good for a case where guest can't
> > > > +     * write to CLR_DOWN because of VM reset, and for compatibility 
> > > > with
> > > > +     * old guests which do not have CLR_DOWN.
> > > > +     */
> > > > +    s->pci0_status.down &= ~(1U << slot);
> > > > +
> > > > +    QTAILQ_FOREACH_SAFE(qdev, &bus->children, sibling, next) {
> > > > +        PCIDevice *dev = PCI_DEVICE(qdev);
> > > > +        PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(dev);
> > > > +        if (PCI_SLOT(dev->devfn) == slot && !pc->no_hotplug) {
> > > > +            qdev_free(qdev);
> > > > +        }
> > > > +    }
> > > > +}
> > > > +
> > > >  static void piix4_reset(void *opaque)
> > > >  {
> > > >      PIIX4PMState *s = opaque;
> > > > @@ -302,6 +326,19 @@ static void piix4_reset(void *opaque)
> > > >          pci_conf[0x5B] = 0x02;
> > > >      }
> > > >      piix4_update_hotplug(s);
> > > > +    /*
> > > > +     * Guest lost remove events if any.
> > > > +     * As it's being reset it's safe to remove the device now.
> > > > +     */
> > > > +    while (s->pci0_status.down) {
> > > > +        acpi_piix_eject_slot(s, s->pci0_status.down);
> > > > +    }
> > > > +    /*
> > > > +     * Guest lost add events if any.
> > > > +     * As it's being reset and will rescan the bus we can discard
> > > > +     * past events now.
> > > > +     */
> > > > +    s->pci0_status.up = 0;
> > > >  }
> > > >  
> > > >  static void piix4_powerdown(void *opaque, int irq, int power_failing)
> > > > @@ -490,22 +527,31 @@ static uint32_t pciej_read(void *opaque, uint32_t 
> > > > addr)
> > > >  
> > > >  static void pciej_write(void *opaque, uint32_t addr, uint32_t val)
> > > >  {
> > > > -    BusState *bus = opaque;
> > > > -    DeviceState *qdev, *next;
> > > > -    int slot = ffs(val) - 1;
> > > > +    PIIX4PMState *s = opaque;
> > > >  
> > > > -    QTAILQ_FOREACH_SAFE(qdev, &bus->children, sibling, next) {
> > > > -        PCIDevice *dev = PCI_DEVICE(qdev);
> > > > -        PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(dev);
> > > > -        if (PCI_SLOT(dev->devfn) == slot && !pc->no_hotplug) {
> > > > -            qdev_free(qdev);
> > > > -        }
> > > > +    if (val) {
> > > > +        acpi_piix_eject_slot(s, val);
> > > >      }
> > > >  
> > > > -
> > > >      PIIX4_DPRINTF("pciej write %x <== %d\n", addr, val);
> > > >  }
> > > >  
> > > > +static void pci_clr_up_write(void *opaque, uint32_t addr, uint32_t val)
> > > > +{
> > > > +    PIIX4PMState *s = opaque;
> > > > +    s->pci0_status.up &= ~val;
> > > > +
> > > > +    PIIX4_DPRINTF("pci_clr_up write %x <== %d\n", addr, val);
> > > > +}
> > > > +
> > > > +static void pci_clr_down_write(void *opaque, uint32_t addr, uint32_t 
> > > > val)
> > > > +{
> > > > +    PIIX4PMState *s = opaque;
> > > > +    s->pci0_status.down &= ~val;
> > > > +
> > > > +    PIIX4_DPRINTF("pci_clr_down write %x <== %d\n", addr, val);
> > > > +}
> > > > +
> > > >  static uint32_t pcirmv_read(void *opaque, uint32_t addr)
> > > >  {
> > > >      PIIX4PMState *s = opaque;
> > > > @@ -532,12 +578,15 @@ static void piix4_acpi_system_hot_add_init(PCIBus 
> > > > *bus, PIIX4PMState *s)
> > > >      register_ioport_write(PCI_BASE, 8, 4, pcihotplug_write, 
> > > > pci0_status);
> > > >      register_ioport_read(PCI_BASE, 8, 4,  pcihotplug_read, 
> > > > pci0_status);
> > > >  
> > > > -    register_ioport_write(PCI_EJ_BASE, 4, 4, pciej_write, bus);
> > > > -    register_ioport_read(PCI_EJ_BASE, 4, 4,  pciej_read, bus);
> > > > +    register_ioport_write(PCI_EJ_BASE, 4, 4, pciej_write, s);
> > > > +    register_ioport_read(PCI_EJ_BASE, 4, 4,  pciej_read, s);
> > > >  
> > > >      register_ioport_write(PCI_RMV_BASE, 4, 4, pcirmv_write, s);
> > > >      register_ioport_read(PCI_RMV_BASE, 4, 4,  pcirmv_read, s);
> > > >  
> > > > +    register_ioport_write(PCI_CLR_UP_BASE, 4, 4, pci_clr_up_write, s);
> > > > +    register_ioport_write(PCI_CLR_DOWN_BASE, 4, 4, pci_clr_down_write, 
> > > > s);
> > > > +
> > > >      pci_bus_hotplug(bus, piix4_device_hotplug, &s->dev.qdev);
> > > >  }
> > > >  
> > > > @@ -567,8 +616,6 @@ static int piix4_device_hotplug(DeviceState *qdev, 
> > > > PCIDevice *dev,
> > > >          return 0;
> > > >      }
> > > >  
> > > > -    s->pci0_status.up = 0;
> > > > -    s->pci0_status.down = 0;
> > > >      if (state == PCI_HOTPLUG_ENABLED) {
> > > >          enable_device(s, slot);
> > > >      } else {
> > > 
> > > 
> 
> 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]