qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 05/19] ppc/pnv: add XIVE support


From: David Gibson
Subject: Re: [Qemu-devel] [PATCH 05/19] ppc/pnv: add XIVE support
Date: Thu, 21 Feb 2019 14:13:16 +1100
User-agent: Mutt/1.10.1 (2018-07-13)

On Tue, Feb 19, 2019 at 08:31:25AM +0100, Cédric Le Goater wrote:
> On 2/12/19 6:40 AM, David Gibson wrote:
> > On Mon, Jan 28, 2019 at 10:46:11AM +0100, Cédric Le Goater wrote:
[snip]
> >>  #endif /* _PPC_PNV_H */
> >> diff --git a/include/hw/ppc/pnv_core.h b/include/hw/ppc/pnv_core.h
> >> index 9961ea3a92cd..8e57c064e661 100644
> >> --- a/include/hw/ppc/pnv_core.h
> >> +++ b/include/hw/ppc/pnv_core.h
> >> @@ -49,6 +49,7 @@ typedef struct PnvCoreClass {
> >>  
> >>  typedef struct PnvCPUState {
> >>      struct ICPState *icp;
> >> +    struct XiveTCTX *tctx;
> > 
> > Unlike sPAPR, we really do always know in advance the interrupt
> > controller for a particular machine.  I think it makes sense to
> > further split the POWER8 and POWER9 cases here, so we only track one
> > for any given setup.
> 
> So, you would define a : 
> 
>   typedef struct Pnv9CPUState {
>       struct XiveTCTX *tctx;
>   } Pnv9CPUState;
> 
> to be allocated when the core is realized ? and later the routine 
> pnv_chip_power9_intc_create() would assign the ->tctx pointer.

Sounds about right.

[snip]
> >> +    uint32_t      nr_ends;
> >> +    XiveENDSource end_source;
> >> +
> >> +    /* Interrupt controller registers */
> >> +    uint64_t      regs[0x300];
> >> +
> >> +    /* Can be configured by FW */
> >> +    uint32_t      tctx_chipid;
> >> +    uint32_t      chip_id;
> > 
> > Can't you derive that since you have a pointer to the owning chip?
> 
> Not always, there are register fields to purposely override this value.
> I can improve the current code a little I think.

Ok.

[snip]
> >> +/*
> >> + * Virtual structures table (VST)
> >> + */
> >> +typedef struct XiveVstInfo {
> >> +    uint32_t    type;
> >> +    const char *name;
> >> +    uint32_t    size;
> >> +    uint32_t    max_blocks;
> >> +} XiveVstInfo;
> >> +
> >> +static const XiveVstInfo vst_infos[] = {
> >> +    [VST_TSEL_IVT]  = { VST_TSEL_IVT,  "EAT",  sizeof(XiveEAS), 16 },
> > 
> > I don't love explicitly storing the type/index in each record, as well
> > as it being implicit in the table slot.
> 
> The 'vst_infos' table decribes the different table types and the 'type' 
> field is used to index the runtime table of VSDs. See
> pnv_xive_vst_addr()

Yes, I know what it's for but it's still redundant information.  You
could avoid it, for example, by passing around an index instead of a
pointer to a vst_infos[] slot - then you can look up both vst_infos
and the other table using that index.

[snip]
> >> +    case CQ_TM1_BAR: /* TM BAR. 4 pages. Map only once */
> >> +    case CQ_TM2_BAR: /* second TM BAR. for hotplug. Not modeled */
> >> +        xive->tm_shift = val & CQ_TM_BAR_64K ? 16 : 12;
> >> +        if (!(val & CQ_TM_BAR_VALID)) {
> >> +            xive->tm_base = 0;
> >> +            if (xive->regs[reg] & CQ_TM_BAR_VALID && xive->chip_id == 0) {
> >> +                memory_region_del_subregion(sysmem, &xive->tm_mmio);
> >> +            }
> >> +        } else {
> >> +            xive->tm_base = val & ~(CQ_TM_BAR_VALID | CQ_TM_BAR_64K);
> >> +            if (!(xive->regs[reg] & CQ_TM_BAR_VALID) && xive->chip_id == 
> >> 0) {
> >> +                memory_region_add_subregion(sysmem, xive->tm_base,
> >> +                                            &xive->tm_mmio);
> >> +            }
> >> +        }
> >> +        break;
> >> +
> >> +    case CQ_PC_BARM:
> >> +        xive->regs[reg] = val;
> > 
> > As discussed elsewhere, this seems to be a big mix of writing things
> > directly into regs[reg] and doing other things instead, and you really
> > want to go one way or the other.  I'd suggest dropping xive->regs[]
> > and instead putting the state you need persistent into its own
> > variables.
> 
> I made a big effort to introduce helper routines to avoid storing values 
> that can be calculated under the PnvXive model, as you asked for it. 
> The assignment above is only necessary for the pnv_xive_pc_size() below
> and I don't know how handle this current case without duplicating the 
> switch statement, which I think is ugly.

I'm not sure quite what you mean about duplicating the case.

The point here is that since you're only storing in a couple of the
switch cases, you can just have explicit data backing just those
values and write to those in the switch cases instead of having a
great big regs[] array of which only a few bits are used.

> So, I will keep the xive->regs[] and make the couple of fixes still needed.

[snip]
> >> +/*
> >> + * Virtualization Controller MMIO region containing the IPI and END ESB 
> >> pages
> >> + */
> >> +static uint64_t pnv_xive_vc_read(void *opaque, hwaddr offset,
> >> +                                 unsigned size)
> >> +{
> >> +    PnvXive *xive = PNV_XIVE(opaque);
> >> +    uint64_t edt_index = offset >> pnv_xive_edt_shift(xive);
> >> +    uint64_t edt_type = 0;
> >> +    uint64_t edt_offset;
> >> +    MemTxResult result;
> >> +    AddressSpace *edt_as = NULL;
> >> +    uint64_t ret = -1;
> >> +
> >> +    if (edt_index < XIVE_TABLE_EDT_MAX) {
> >> +        edt_type = GETFIELD(CQ_TDR_EDT_TYPE, xive->edt[edt_index]);
> >> +    }
> >> +
> >> +    switch (edt_type) {
> >> +    case CQ_TDR_EDT_IPI:
> >> +        edt_as = &xive->ipi_as;
> >> +        break;
> >> +    case CQ_TDR_EDT_EQ:
> >> +        edt_as = &xive->end_as;
> > 
> > I'm not entirely understanding the function of these AddressSpace objectz.
> 
> The IPI and END ESB pages are exposed in the same VC region. But this VC 
> region is not splitted between the two types with a single offset. It 
> is segmented with the EDT tables which defines the type of each segment. 
> 
> The purpose of these address spaces is to translate the load/stores in 
> the VC region in the equivalent IPI or END region exposing contigously 
> ESB pages of the same type: 'end_edt_mmio' and 'ipi_edt_mmio'. 
> 
> The memory regions of the XiveENDSource and the XiveSource for the IPIs 
> are mapped in 'end_edt_mmio' and 'ipi_edt_mmio'.

Hmm.  Ok, I'm not immediately seeing why you can't map the various IPI
or END blocks directly into address_space memory rather than having
this extra internal layer of indirection.

> (This is changing for P10, should be simpler)   

[snip]
> >> +/*
> >> + *  Maximum number of IRQs and ENDs supported by HW
> >> + */
> >> +#define PNV_XIVE_NR_IRQS (PNV9_XIVE_VC_SIZE / (1ull << 
> >> XIVE_ESB_64K_2PAGE))
> >> +#define PNV_XIVE_NR_ENDS (PNV9_XIVE_VC_SIZE / (1ull << 
> >> XIVE_ESB_64K_2PAGE))
> >> +
> >> +static void pnv_xive_realize(DeviceState *dev, Error **errp)
> >> +{
> >> +    PnvXive *xive = PNV_XIVE(dev);
> >> +    XiveSource *xsrc = &xive->source;
> >> +    XiveENDSource *end_xsrc = &xive->end_source;
> >> +    Error *local_err = NULL;
> >> +    Object *obj;
> >> +
> >> +    obj = object_property_get_link(OBJECT(dev), "chip", &local_err);
> >> +    if (!obj) {
> >> +        error_propagate(errp, local_err);
> >> +        error_prepend(errp, "required link 'chip' not found: ");
> >> +        return;
> >> +    }
> >> +
> >> +    /* The PnvChip id identifies the XIVE interrupt controller. */
> >> +    xive->chip = PNV_CHIP(obj);
> >> +
> >> +    /*
> >> +     * Xive Interrupt source and END source object with the maximum
> >> +     * allowed HW configuration. The ESB MMIO regions will be resized
> >> +     * dynamically when the controller is configured by the FW to
> >> +     * limit accesses to resources not provisioned.
> >> +     */
> >> +    object_property_set_int(OBJECT(xsrc), PNV_XIVE_NR_IRQS, "nr-irqs",
> >> +                            &error_fatal);
> > 
> > You have a constant here, but your router object also includes a
> > nr_irqs field.  What's up with that?
> 
> The XiveSource object of PnvXive is realized for the maximum size allowed
> by HW because we don't know how much IRQs the FW will provision for.
> 
> The 'nr_irqs' is the number of IRQs configured by the FW (We changed the
> name to 'nr_ipis') See routine pnv_xive_vst_set_exclusive()

Ah, ok.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]