qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v5 15/16] spapr_pci: enable basic hotplug operat


From: Michael Roth
Subject: Re: [Qemu-devel] [PATCH v5 15/16] spapr_pci: enable basic hotplug operations
Date: Tue, 24 Feb 2015 23:17:24 -0600
User-agent: alot/0.3.4

Quoting David Gibson (2015-02-24 21:11:29)
> On Mon, Feb 16, 2015 at 08:27:51AM -0600, Michael Roth wrote:
> > This enables hotplug for PHB bridges. Upon hotplug we generate the
> 
> "PCI Host Bridge bridges" :-p
> 
> > OF-nodes required by PAPR specification and IEEE 1275-1994
> > "PCI Bus Binding to Open Firmware" for the device.
> > 
> > We associate the corresponding FDT for these nodes with the DrcEntry
> > corresponding to the slot, which will be fetched via
> > ibm,configure-connector RTAS calls by the guest as described by PAPR
> > specification. The FDT is cleaned up in the case of unplug.
> > 
> > Signed-off-by: Michael Roth <address@hidden>
> > ---
> >  hw/ppc/spapr_pci.c | 391 
> > ++++++++++++++++++++++++++++++++++++++++++++++++++---
> >  1 file changed, 371 insertions(+), 20 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > index 6a3d917..b9af1cd 100644
> > --- a/hw/ppc/spapr_pci.c
> > +++ b/hw/ppc/spapr_pci.c
> > @@ -33,8 +33,11 @@
> >  #include <libfdt.h>
> >  #include "trace.h"
> >  #include "qemu/error-report.h"
> > +#include "qapi/qmp/qerror.h"
> >  
> >  #include "hw/pci/pci_bus.h"
> > +#include "hw/ppc/spapr_drc.h"
> > +#include "sysemu/device_tree.h"
> >  
> >  /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */
> >  #define RTAS_QUERY_FN           0
> > @@ -47,7 +50,13 @@
> >  #define RTAS_TYPE_MSI           1
> >  #define RTAS_TYPE_MSIX          2
> >  
> > -#include "hw/ppc/spapr_drc.h"
> > +#define _FDT(exp) \
> > +    do { \
> > +        int ret = (exp);                                           \
> > +        if (ret < 0) {                                             \
> > +            return ret;                                            \
> > +        }                                                          \
> > +    } while (0)
> >  
> >  static sPAPRPHBState *find_phb(sPAPREnvironment *spapr, uint64_t buid)
> >  {
> > @@ -481,6 +490,359 @@ static AddressSpace *spapr_pci_dma_iommu(PCIBus *bus, 
> > void *opaque, int devfn)
> >      return &phb->iommu_as;
> >  }
> >  
> > +/* Macros to operate with address in OF binding to PCI */
> > +#define b_x(x, p, l)    (((x) & ((1<<(l))-1)) << (p))
> > +#define b_n(x)          b_x((x), 31, 1) /* 0 if relocatable */
> > +#define b_p(x)          b_x((x), 30, 1) /* 1 if prefetchable */
> > +#define b_t(x)          b_x((x), 29, 1) /* 1 if the address is aliased */
> > +#define b_ss(x)         b_x((x), 24, 2) /* the space code */
> > +#define b_bbbbbbbb(x)   b_x((x), 16, 8) /* bus number */
> > +#define b_ddddd(x)      b_x((x), 11, 5) /* device number */
> > +#define b_fff(x)        b_x((x), 8, 3)  /* function number */
> > +#define b_rrrrrrrr(x)   b_x((x), 0, 8)  /* register number */
> > +/* for 'reg'/'assigned-addresses' OF properties */
> > +#define RESOURCE_CELLS_SIZE 2
> > +#define RESOURCE_CELLS_ADDRESS 3
> > +
> > +typedef struct ResourceFields {
> > +    uint32_t phys_hi;
> > +    uint32_t phys_mid;
> > +    uint32_t phys_lo;
> > +    uint32_t size_hi;
> > +    uint32_t size_lo;
> > +} ResourceFields;
> 
> Hrm, 5*32-bit ints, that probably needs a ((packed)) on at least some
> platforms to be safe.

I seem to remember some rule (C99?) about padding only being used in cases
where an n-byte field doesn't naturally fall on an n-byte boundary, but
it doesn't hurt to be safe/explicit about what we're expecting.

> 
> > +typedef struct ResourceProps {
> > +    union {
> > +        ResourceFields fields[8];
> > +        uint8_t data[sizeof(ResourceFields) * 8];
> > +    } reg;
> > +    union {
> > +        ResourceFields fields[7];
> > +        uint8_t data[sizeof(ResourceFields) * 7];
> > +    } assigned;
> > +    uint32_t reg_len;
> > +    uint32_t assigned_len;
> > +} ResourceProps;
> 
> I'm a bit dubious about this union, rather than just casting when you
> need the bytestring version.

Yah, I may have gotten a bit carried away there...

> 
> > +
> > +/* fill in the 'reg'/'assigned-resources' OF properties for
> > + * a PCI device. 'reg' describes resource requirements for a
> > + * device's IO/MEM regions, 'assigned-addresses' describes the
> > + * actual resource assignments.
> > + *
> > + * the properties are arrays of ('phys-addr', 'size') pairs describing
> > + * the addressable regions of the PCI device, where 'phys-addr' is a
> > + * RESOURCE_CELLS_ADDRESS-tuple of 32-bit integers corresponding to
> > + * (phys.hi, phys.mid, phys.lo), and 'size' is a
> > + * RESOURCE_CELLS_SIZE-tuple corresponding to (size.hi, size.lo).
> > + * phys.hi = 0xYYXXXXZZ, where:
> > + *   0xYY = npt000ss
> > + *          |||   |
> > + *          |||   +-- space code: 1 if IO region, 2 if MEM region
> > + *          ||+------ for non-relocatable IO: 1 if aliased
> > + *          ||        for relocatable IO: 1 if below 64KB
> > + *          ||        for MEM: 1 if below 1MB
> > + *          |+------- 1 if region is prefetchable
> > + *          +-------- 1 if region is non-relocatable
> > + *   0xXXXX = bbbbbbbb dddddfff, encoding bus, slot, and function
> > + *            bits respectively
> > + *   0xZZ = rrrrrrrr, the register number of the BAR corresponding
> > + *          to the region
> > + *
> > + * phys.mid and phys.lo correspond respectively to the hi/lo portions
> > + * of the actual address of the region.
> > + *
> > + * how the phys-addr/size values are used differ slightly between
> > + * 'reg' and 'assigned-addresses' properties. namely, 'reg' has
> > + * an additional description for the config space region of the
> > + * device, and in the case of QEMU has n=0 and phys.mid=phys.lo=0
> > + * to describe the region as relocatable, with an address-mapping
> > + * that corresponds directly to the PHB's address space for the
> > + * resource. 'assigned-addresses' always has n=1 set with an absolute
> > + * address assigned for the resource. in general, 'assigned-addresses'
> > + * won't be populated, since addresses for PCI devices are generally
> > + * unmapped initially and left to the guest to assign.
> > + *
> > + * in accordance with PCI Bus Binding to Open Firmware,
> > + * IEEE Std 1275-1994, section 4.1.1, as implemented by PAPR+ v2.7,
> > + * Appendix C.
> > + */
> > +static void populate_resource_props(PCIDevice *d, ResourceProps *rp)
> > +{
> > +    int bus_num = pci_bus_num(PCI_BUS(qdev_get_parent_bus(DEVICE(d))));
> > +    uint32_t dev_id = (b_bbbbbbbb(bus_num) |
> > +                       b_ddddd(PCI_SLOT(d->devfn)) |
> > +                       b_fff(PCI_FUNC(d->devfn)));
> > +    ResourceFields *reg, *assigned;
> > +    int i, reg_idx = 0, assigned_idx = 0;
> > +
> > +    /* config space region */
> > +    reg = &rp->reg.fields[reg_idx++];
> > +    reg->phys_hi = cpu_to_be32(dev_id);
> > +    reg->phys_mid = 0;
> > +    reg->phys_lo = 0;
> > +    reg->size_hi = 0;
> > +    reg->size_lo = 0;
> > +
> > +    for (i = 0; i < PCI_NUM_REGIONS; i++) {
> > +        if (!d->io_regions[i].size) {
> > +            continue;
> > +        }
> > +
> > +        reg = &rp->reg.fields[reg_idx++];
> > +
> > +        reg->phys_hi = cpu_to_be32(dev_id | b_rrrrrrrr(pci_bar(d, i)));
> > +        if (d->io_regions[i].type & PCI_BASE_ADDRESS_SPACE_IO) {
> > +            reg->phys_hi |= cpu_to_be32(b_ss(1));
> > +        } else {
> > +            reg->phys_hi |= cpu_to_be32(b_ss(2));
> > +        }
> > +        reg->phys_mid = 0;
> > +        reg->phys_lo = 0;
> > +        reg->size_hi = cpu_to_be32(d->io_regions[i].size >> 32);
> > +        reg->size_lo = cpu_to_be32(d->io_regions[i].size);
> > +
> > +        if (d->io_regions[i].addr != PCI_BAR_UNMAPPED) {
> > +            continue;
> 
> This has inverted sense, doesn't it?  If the BAR *is* unmapped you
> want to continue, skipping the assigned-addresses stuff.

Yes, should be inverted :( It ends up not mattering much with the
current userspace code, since we rely on PCI rescan which always
re-assigns BARs, but definitely needs fixing.

> 
> > +        }
> > +
> > +        assigned = &rp->assigned.fields[assigned_idx++];
> > +        assigned->phys_hi = cpu_to_be32(reg->phys_hi | b_n(1));
> > +        assigned->phys_mid = cpu_to_be32(d->io_regions[i].addr >> 32);
> > +        assigned->phys_lo = cpu_to_be32(d->io_regions[i].addr);
> 
> Not sure if I understood the comment above properly; should these
> addresses actually be translated through an mappings above the PHB?
> Not that there are any such translations on sPAPR, but...

The io_regions[i].addr values are relative to the IO/MEM address spaces for
the device, which correspond to IO/MEM windows for the PHB. So I think the
memory API should handle any translation above that...

Or do you mean because of the n=0/relocatable IO regions described by 'reg'?

IIUC, when n=0, reg->phys_{mid,lo} can be used to encode a starting offset,
so a guest can re-assign/re-program an IO region that's already been assigned
and described via the correspond fields in 'assigned-addresses', so long as
uses an addr above reg->phys{mid,lo}.

So, we could use those reg->phys_{mid,lo} values as an alternative to the
PHBs IO/MEM windows (I guess for platforms that don't provide these windows
and just expose the full address space?).

But since we have those windows, we end up not needing this, so we always do
n=0, reg->phys_mid = reg->phys_lo = 0, so the values in
assigned->phys_{mid,lo} end up just being offsets into the IO/MEM windows,
which correspond directly to io_regions[n].addr.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]