qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-ppc] [PATCH v2 02/19] spapr: introduce a skeleton for the XIVE


From: David Gibson
Subject: Re: [Qemu-ppc] [PATCH v2 02/19] spapr: introduce a skeleton for the XIVE interrupt controller
Date: Thu, 12 Apr 2018 15:07:07 +1000
User-agent: Mutt/1.9.2 (2017-12-15)

On Wed, Dec 20, 2017 at 08:38:41AM +0100, Cédric Le Goater wrote:
> On 12/20/2017 06:09 AM, David Gibson wrote:
> > On Sat, Dec 09, 2017 at 09:43:21AM +0100, Cédric Le Goater wrote:
> >> With the POWER9 processor comes a new interrupt controller called
> >> XIVE. It is composed of three sub-engines :
> >>
> >>   - Interrupt Virtualization Source Engine (IVSE). These are in PHBs,
> >>     in the main controller for the IPIS and in the PSI host
> >>     bridge. They are configured to feed the IVRE with events.
> >>
> >>   - Interrupt Virtualization Routing Engine (IVRE). Their job is to
> >>     match an event source with a Notification Virtualization Target
> >>     (NVT), a priority and an Event Queue (EQ) to determine if a
> >>     Virtual Processor can handle the event.
> >>
> >>   - Interrupt Virtualization Presentation Engine (IVPE). It maintains
> >>     the interrupt state of each hardware thread and present the
> >>     notification as an external exception.
> >>
> >> Each of the engines uses a set of internal tables to redirect
> >> exceptions from event sources to CPU threads. The first table we
> >> introduce is the Interrupt Virtualization Entry (IVE) table, part of
> >> the virtualization engine in charge of routing events. It associates
> >> event sources (IRQ numbers) to event queues which will forward, or
> >> not, the event notification to the presentation controller.
> >>
> >> The XIVE model is designed to make use of the full range of the IRQ
> >> number space and does not use an offset like the XICS mode does.
> >> Hence, the IVE table is directly indexed by the IRQ number.
> >>
> >> Signed-off-by: Cédric Le Goater <address@hidden>
> > 
> > As you've suggested in yourself, I think we might need to more
> > explicitly model the different components of the XIVE system.  As part
> > of that, I think you need to be clearer in this base skeleton about
> > exactly what component your XIVE object represents.

Sorry it's been so long since I looked at these.

> ok. The base skeleton is the IVRE, the central engine handling 
> the routing. 
> 
> > If the answer is "the overall thing" 
> 
> Yes, it is more or less that currently. 
> 
> The sPAPRXive object models the source engine and the routing 
> engine in one object.

Yeah, I suspect we don't want that.  Although it might seem simpler in
the spapr case, at least at first glance, I think it will cause us
problems later.  At the very least, it's likely to make it harder to
share code between the spapr and powernv case.  I think it will also
make for more confusion about exactly what things belong where.

> I have merged these for simplicity and because the interrupt 
> controller has an internal source for the interrupts of the "IPI" 
> type, which are used for the CPU IPIs but also for other generic 
> interrupts, like the OpenCAPI ones. The XIVE sPAPR interface is 
> also much simpler than the baremetal one, all the tables are 
> maintained in the hypervisor, so this choice made some sense. 
> 
> But since, I have started the PowerNV model and I am duplicating 
> a lot of code to handle the triggering and the MMIOs in the 
> different sources. So I am not convinced anymore. Nevertheless, 
> the overall routing logic is the same even if some the tables 
> are not located in QEMU anymore, but in the machine memory.
> 
> The sPAPRXiveNVT models some of the CPU presenter engine. It 
> holds the virtual CPU interrupt states when not dispatched on 
> a real HW thread. Real world is more complex. There are "CAM" 
> lines in the HW threads which are compared to find a matching 
> candidate. But I don't think we need to anything more complex 
> than today unless we want to support KVM under TCG ...
>    
> > I suspect that's not what you
> > want - I had one of those for XICs which proved to be a mistake
> > (eventually replaced by the XICSFabric interface).
> 
> The XICSFabric would be the main Xive object. The interface 
> between the sources and the routing engine is hidden in sPAPR, 
> we can use a simple function call : 
> 
>       spapr_xive_irq(pnv->xive, irq);
> 
> we could get rid of the qirqs but they are required for XICS.

I don't quite follow, but this doesn't sound right.

> PowerNV uses MMIOs to notify an event and it makes the modeling
> somewhat easier. Each controller model has a notify port address 
> register on which a interrupt number is written to forward an 
> event to the routing engine. So it is a simple store. 
> 
> I don't know why there is a different notify port address per
> source, may be for extra filtering at the routing engine level.   
> 
> > Changing the model later isn't impossible, but doing so without
> > breaking migration can be a real pain, so I think it's worth a
> > reasonable effort to try and get it right initially.
> 
> I completely agree. 
> 
> This is why I have started the PnvXive model to challenge the 
> current PAPR design. I have hacked a bunch of patches for XIVE, 
> LPC, PSI, OCC and basic PPC support which boot a PowerNV P9 up to 
> petitboot. It would look better with a source object, but the 
> location of the PQ bits is a bit problematic. It highly depends 
> on the controller. The main controller uses tables in the hypervisor
> memory. The PSIHB controller has its own bits. I suppose it is 
> the same for PHB4. I need to take a closer look at how we could
> have a common source object.

Ok, sounds like a good idea.

> 
> The most important part is KVM support and how we expose the 
> MMIO region. We need to make progress on that topic.
> 
> Thanks,
> 
> C.  
>  
> 
> >> ---
> >>
> >>  Changes since v1 :
> >>
> >>  - used g_new0 instead of g_malloc0
> >>  - removed VMSTATE_STRUCT_VARRAY_UINT32_ALLOC 
> >>  - introduced a device reset handler. the object needs to be parented
> >>    to sysbus when created.
> >>  - renamed spapr_xive_irq_set to spapr_xive_irq_enable
> >>  - renamed spapr_xive_irq_unset to spapr_xive_irq_disable
> >>  - moved the PPC_BIT macros under target/ppc/cpu.h
> >>  - shrinked file copyright header
> >>
> >>  default-configs/ppc64-softmmu.mak |   1 +
> >>  hw/intc/Makefile.objs             |   1 +
> >>  hw/intc/spapr_xive.c              | 156 
> >> ++++++++++++++++++++++++++++++++++++++
> >>  hw/intc/xive-internal.h           |  41 ++++++++++
> >>  include/hw/ppc/spapr_xive.h       |  35 +++++++++
> >>  5 files changed, 234 insertions(+)
> >>  create mode 100644 hw/intc/spapr_xive.c
> >>  create mode 100644 hw/intc/xive-internal.h
> >>  create mode 100644 include/hw/ppc/spapr_xive.h
> >>
> >> diff --git a/default-configs/ppc64-softmmu.mak 
> >> b/default-configs/ppc64-softmmu.mak
> >> index d1b3a6dd50f8..4a7f6a0696de 100644
> >> --- a/default-configs/ppc64-softmmu.mak
> >> +++ b/default-configs/ppc64-softmmu.mak
> >> @@ -56,6 +56,7 @@ CONFIG_SM501=y
> >>  CONFIG_XICS=$(CONFIG_PSERIES)
> >>  CONFIG_XICS_SPAPR=$(CONFIG_PSERIES)
> >>  CONFIG_XICS_KVM=$(call land,$(CONFIG_PSERIES),$(CONFIG_KVM))
> >> +CONFIG_XIVE_SPAPR=$(CONFIG_PSERIES)
> >>  # For PReP
> >>  CONFIG_SERIAL_ISA=y
> >>  CONFIG_MC146818RTC=y
> >> diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
> >> index ae358569a155..49e13e7aeeee 100644
> >> --- a/hw/intc/Makefile.objs
> >> +++ b/hw/intc/Makefile.objs
> >> @@ -35,6 +35,7 @@ obj-$(CONFIG_SH4) += sh_intc.o
> >>  obj-$(CONFIG_XICS) += xics.o
> >>  obj-$(CONFIG_XICS_SPAPR) += xics_spapr.o
> >>  obj-$(CONFIG_XICS_KVM) += xics_kvm.o
> >> +obj-$(CONFIG_XIVE_SPAPR) += spapr_xive.o
> >>  obj-$(CONFIG_POWERNV) += xics_pnv.o
> >>  obj-$(CONFIG_ALLWINNER_A10_PIC) += allwinner-a10-pic.o
> >>  obj-$(CONFIG_S390_FLIC) += s390_flic.o
> >> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> >> new file mode 100644
> >> index 000000000000..e6e8841add17
> >> --- /dev/null
> >> +++ b/hw/intc/spapr_xive.c
> >> @@ -0,0 +1,156 @@
> >> +/*
> >> + * QEMU PowerPC sPAPR XIVE interrupt controller model
> >> + *
> >> + * Copyright (c) 2017, IBM Corporation.
> >> + *
> >> + * This code is licensed under the GPL version 2 or later. See the
> >> + * COPYING file in the top-level directory.
> >> + */
> >> +
> >> +#include "qemu/osdep.h"
> >> +#include "qemu/log.h"
> >> +#include "qapi/error.h"
> >> +#include "target/ppc/cpu.h"
> >> +#include "sysemu/cpus.h"
> >> +#include "sysemu/dma.h"
> >> +#include "monitor/monitor.h"
> >> +#include "hw/ppc/spapr_xive.h"
> >> +
> >> +#include "xive-internal.h"
> >> +
> >> +/*
> >> + * Main XIVE object
> >> + */
> >> +
> >> +void spapr_xive_pic_print_info(sPAPRXive *xive, Monitor *mon)
> >> +{
> >> +    int i;
> >> +
> >> +    for (i = 0; i < xive->nr_irqs; i++) {
> >> +        XiveIVE *ive = &xive->ivt[i];
> >> +
> >> +        if (!(ive->w & IVE_VALID)) {
> >> +            continue;
> >> +        }
> >> +
> >> +        monitor_printf(mon, "  %4x %s %08x %08x\n", i,
> >> +                       ive->w & IVE_MASKED ? "M" : " ",
> >> +                       (int) GETFIELD(IVE_EQ_INDEX, ive->w),
> >> +                       (int) GETFIELD(IVE_EQ_DATA, ive->w));
> >> +    }
> >> +}
> >> +
> >> +static void spapr_xive_reset(DeviceState *dev)
> >> +{
> >> +    sPAPRXive *xive = SPAPR_XIVE(dev);
> >> +    int i;
> >> +
> >> +    /* Mask all valid IVEs in the IRQ number space. */
> >> +    for (i = 0; i < xive->nr_irqs; i++) {
> >> +        XiveIVE *ive = &xive->ivt[i];
> >> +        if (ive->w & IVE_VALID) {
> >> +            ive->w |= IVE_MASKED;
> >> +        }
> >> +    }
> >> +}
> >> +
> >> +static void spapr_xive_realize(DeviceState *dev, Error **errp)
> >> +{
> >> +    sPAPRXive *xive = SPAPR_XIVE(dev);
> >> +
> >> +    if (!xive->nr_irqs) {
> >> +        error_setg(errp, "Number of interrupt needs to be greater 0");
> >> +        return;
> >> +    }
> >> +
> >> +    /* Allocate the IVT (Interrupt Virtualization Table) */
> >> +    xive->ivt = g_new0(XiveIVE, xive->nr_irqs);
> >> +}
> >> +
> >> +static const VMStateDescription vmstate_spapr_xive_ive = {
> >> +    .name = TYPE_SPAPR_XIVE "/ive",
> >> +    .version_id = 1,
> >> +    .minimum_version_id = 1,
> >> +    .fields = (VMStateField []) {
> >> +        VMSTATE_UINT64(w, XiveIVE),
> >> +        VMSTATE_END_OF_LIST()
> >> +    },
> >> +};
> >> +
> >> +static bool vmstate_spapr_xive_needed(void *opaque)
> >> +{
> >> +    /* TODO check machine XIVE support */
> >> +    return true;
> >> +}
> >> +
> >> +static const VMStateDescription vmstate_spapr_xive = {
> >> +    .name = TYPE_SPAPR_XIVE,
> >> +    .version_id = 1,
> >> +    .minimum_version_id = 1,
> >> +    .needed = vmstate_spapr_xive_needed,
> >> +    .fields = (VMStateField[]) {
> >> +        VMSTATE_UINT32_EQUAL(nr_irqs, sPAPRXive, NULL),
> >> +        VMSTATE_STRUCT_VARRAY_UINT32(ivt, sPAPRXive, nr_irqs, 1,
> >> +                                     vmstate_spapr_xive_ive, XiveIVE),
> >> +        VMSTATE_END_OF_LIST()
> >> +    },
> >> +};
> >> +
> >> +static Property spapr_xive_properties[] = {
> >> +    DEFINE_PROP_UINT32("nr-irqs", sPAPRXive, nr_irqs, 0),
> >> +    DEFINE_PROP_END_OF_LIST(),
> >> +};
> >> +
> >> +static void spapr_xive_class_init(ObjectClass *klass, void *data)
> >> +{
> >> +    DeviceClass *dc = DEVICE_CLASS(klass);
> >> +
> >> +    dc->realize = spapr_xive_realize;
> >> +    dc->reset = spapr_xive_reset;
> >> +    dc->props = spapr_xive_properties;
> >> +    dc->desc = "sPAPR XIVE interrupt controller";
> >> +    dc->vmsd = &vmstate_spapr_xive;
> >> +}
> >> +
> >> +static const TypeInfo spapr_xive_info = {
> >> +    .name = TYPE_SPAPR_XIVE,
> >> +    .parent = TYPE_SYS_BUS_DEVICE,
> >> +    .instance_size = sizeof(sPAPRXive),
> >> +    .class_init = spapr_xive_class_init,
> >> +};
> >> +
> >> +static void spapr_xive_register_types(void)
> >> +{
> >> +    type_register_static(&spapr_xive_info);
> >> +}
> >> +
> >> +type_init(spapr_xive_register_types)
> >> +
> >> +XiveIVE *spapr_xive_get_ive(sPAPRXive *xive, uint32_t lisn)
> >> +{
> >> +    return lisn < xive->nr_irqs ? &xive->ivt[lisn] : NULL;
> >> +}
> >> +
> >> +bool spapr_xive_irq_enable(sPAPRXive *xive, uint32_t lisn)
> >> +{
> >> +    XiveIVE *ive = spapr_xive_get_ive(xive, lisn);
> >> +
> >> +    if (!ive) {
> >> +        return false;
> >> +    }
> >> +
> >> +    ive->w |= IVE_VALID;
> >> +    return true;
> >> +}
> >> +
> >> +bool spapr_xive_irq_disable(sPAPRXive *xive, uint32_t lisn)
> >> +{
> >> +    XiveIVE *ive = spapr_xive_get_ive(xive, lisn);
> >> +
> >> +    if (!ive) {
> >> +        return false;
> >> +    }
> >> +
> >> +    ive->w &= ~IVE_VALID;
> >> +    return true;
> >> +}
> >> diff --git a/hw/intc/xive-internal.h b/hw/intc/xive-internal.h
> >> new file mode 100644
> >> index 000000000000..132b71a6daf0
> >> --- /dev/null
> >> +++ b/hw/intc/xive-internal.h
> >> @@ -0,0 +1,41 @@
> >> +/*
> >> + * QEMU PowerPC XIVE interrupt controller model
> >> + *
> >> + * Copyright (c) 2016-2017, IBM Corporation.
> >> + *
> >> + * This code is licensed under the GPL version 2 or later. See the
> >> + * COPYING file in the top-level directory.
> >> + */
> >> +
> >> +#ifndef _INTC_XIVE_INTERNAL_H
> >> +#define _INTC_XIVE_INTERNAL_H
> >> +
> >> +/* Utilities to manipulate these (originaly from OPAL) */
> >> +#define MASK_TO_LSH(m)          (__builtin_ffsl(m) - 1)
> >> +#define GETFIELD(m, v)          (((v) & (m)) >> MASK_TO_LSH(m))
> >> +#define SETFIELD(m, v, val)                             \
> >> +        (((v) & ~(m)) | ((((typeof(v))(val)) << MASK_TO_LSH(m)) & (m)))
> >> +
> >> +/* IVE/EAS
> >> + *
> >> + * One per interrupt source. Targets that interrupt to a given EQ
> >> + * and provides the corresponding logical interrupt number (EQ data)
> >> + *
> >> + * We also map this structure to the escalation descriptor inside
> >> + * an EQ, though in that case the valid and masked bits are not used.
> >> + */
> >> +typedef struct XiveIVE {
> >> +        /* Use a single 64-bit definition to make it easier to
> >> +         * perform atomic updates
> >> +         */
> >> +        uint64_t        w;
> >> +#define IVE_VALID       PPC_BIT(0)
> >> +#define IVE_EQ_BLOCK    PPC_BITMASK(4, 7)        /* Destination EQ block# 
> >> */
> >> +#define IVE_EQ_INDEX    PPC_BITMASK(8, 31)       /* Destination EQ index 
> >> */
> >> +#define IVE_MASKED      PPC_BIT(32)              /* Masked */
> >> +#define IVE_EQ_DATA     PPC_BITMASK(33, 63)      /* Data written to the 
> >> EQ */
> >> +} XiveIVE;
> >> +
> >> +XiveIVE *spapr_xive_get_ive(sPAPRXive *xive, uint32_t lisn);
> >> +
> >> +#endif /* _INTC_XIVE_INTERNAL_H */
> >> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> >> new file mode 100644
> >> index 000000000000..5b1f78e06a1e
> >> --- /dev/null
> >> +++ b/include/hw/ppc/spapr_xive.h
> >> @@ -0,0 +1,35 @@
> >> +/*
> >> + * QEMU PowerPC sPAPR XIVE interrupt controller model
> >> + *
> >> + * Copyright (c) 2017, IBM Corporation.
> >> + *
> >> + * This code is licensed under the GPL version 2 or later. See the
> >> + * COPYING file in the top-level directory.
> >> + */
> >> +
> >> +#ifndef PPC_SPAPR_XIVE_H
> >> +#define PPC_SPAPR_XIVE_H
> >> +
> >> +#include <hw/sysbus.h>
> >> +
> >> +typedef struct sPAPRXive sPAPRXive;
> >> +typedef struct XiveIVE XiveIVE;
> >> +
> >> +#define TYPE_SPAPR_XIVE "spapr-xive"
> >> +#define SPAPR_XIVE(obj) OBJECT_CHECK(sPAPRXive, (obj), TYPE_SPAPR_XIVE)
> >> +
> >> +struct sPAPRXive {
> >> +    SysBusDevice parent;
> >> +
> >> +    /* Properties */
> >> +    uint32_t     nr_irqs;
> >> +
> >> +    /* XIVE internal tables */
> >> +    XiveIVE      *ivt;
> >> +};
> >> +
> >> +bool spapr_xive_irq_enable(sPAPRXive *xive, uint32_t lisn);
> >> +bool spapr_xive_irq_disable(sPAPRXive *xive, uint32_t lisn);
> >> +void spapr_xive_pic_print_info(sPAPRXive *xive, Monitor *mon);
> >> +
> >> +#endif /* PPC_SPAPR_XIVE_H */
> > 
> 

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]