From: Cédric Le Goater
Subject: Re: [Qemu-ppc] [PATCH v2 03/19] spapr: introduce the XIVE interrupt sources
Date: Wed, 20 Dec 2017 08:54:24 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.0

On 12/20/2017 06:22 AM, David Gibson wrote:
> On Sat, Dec 09, 2017 at 09:43:22AM +0100, Cédric Le Goater wrote:
>> Each XIVE interrupt source is associated with a two bit state machine
>> called an Event State Buffer (ESB) : the first bit "P" means that an
>> interrupt is "pending" and waiting for an EOI and the bit "Q" (queued)
>> means a new interrupt was triggered while another was still pending.
>>
>> When an event is triggered, the associated interrupt state bits are
>> fetched and modified and forwarded to the virtualization engine of the
>> controller doing the routing. These can also be controlled by MMIO, to
>> trigger events or turn off the sources for instance. See code for more
>> details on the states and transitions.
>>
>> The MMIO space for the ESBs is 512GB large on the bare-metal system
>> (PowerNV) and the BAR depends on the chip id. In our model for the
>> sPAPR machine, we choose to only map the sub-region for the
>> provisioned IRQ numbers and to use the mapping address of chip 0 of a
>> real system.
>
> I think we probably want a device property to make the virtualized
> base address arbitrary. It's fine for it to default to the chip 0
> base, but that'll make it easier to adapt if we need to later on.
Yes. We can add a "bar" property for this purpose, as some of the
pnv models already do.
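To illustrate what a configurable base would change, here is a minimal
standalone sketch of the address arithmetic. The constants are the ones
from the patch; the helper names (esb_default_bar, esb_page_addr,
esb_region_lisn, esb_region_op) are made up for the example and are not
part of the patch or of QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the patch (hw/intc/xive-internal.h). */
#define P9_MMIO_BASE   0x006000000000000ull
#define VC_BAR_DEFAULT 0x10000000000ull
#define ESB_SHIFT      16 /* one 64k ESB page per source */

/* Hypothetical helper: default base, i.e. the chip 0 address. */
static uint64_t esb_default_bar(void)
{
    return P9_MMIO_BASE | VC_BAR_DEFAULT;
}

/* Hypothetical helper: address of the ESB page of source 'lisn'
 * for an arbitrary base 'bar' (what a "bar" property would hold). */
static uint64_t esb_page_addr(uint64_t bar, uint32_t lisn)
{
    return bar + ((uint64_t)lisn << ESB_SHIFT);
}

/* Decode an offset inside the ESB region, as the MMIO handlers do:
 * the page index selects the source ... */
static uint32_t esb_region_lisn(uint64_t region_offset)
{
    return (uint32_t)(region_offset >> ESB_SHIFT);
}

/* ... and bits 8-11 of the in-page offset select the operation. */
static uint32_t esb_region_op(uint64_t region_offset)
{
    return (uint32_t)(region_offset & 0xF00);
}
```

With the default base, esb_page_addr(esb_default_bar(), 2) gives the
page of source 2. Only the 'bar' argument would change if the property
were set to something else; the decoding inside the region is untouched.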
> As noted in the followup messages, I think you're going to want to
> move this stuff from the current xive object into a "block of sources"
> object.
Yes. I now have a new XIVE source model for the POWER9 PSIHB controller,
which should help to find common ground. This is what I added to support
XIVE in the current PSIHB:
+    /* P9 */
+    MemoryRegion esb_iomem;
+    uint8_t sbe[4]; /* enough for 13 P&Q bits */
+    uint32_t ivt_offset;
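As a sanity check on the size of 'sbe', here is a small standalone
sketch of the packing, reading the comment as 13 sources with one P and
one Q bit each. It follows the same layout as spapr_xive_pq_get() in
the patch (4 sources per byte). The helper names are invented for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PQ_PER_BYTE 4 /* 2 bits per source, so 4 sources per byte */

/* Hypothetical helper: bytes needed to hold the PQ bits of nr_irqs
 * sources, same rounding as DIV_ROUND_UP(nr_irqs, 4) in the patch. */
static size_t sbe_bytes(uint32_t nr_irqs)
{
    return (nr_irqs + PQ_PER_BYTE - 1) / PQ_PER_BYTE;
}

/* Hypothetical helper: reset pattern used by the patch. 0x55 is
 * 0b01010101, so every 2-bit entry becomes 0b01 ("ints off"). */
static void sbe_reset(uint8_t *sbe, size_t size)
{
    memset(sbe, 0x55, size);
}

/* Read the 2-bit PQ state of source 'lisn'. */
static uint8_t sbe_get(const uint8_t *sbe, size_t size, uint32_t lisn)
{
    uint32_t byte = lisn / PQ_PER_BYTE;
    uint32_t bit = (lisn % PQ_PER_BYTE) * 2;

    assert(byte < size);
    return (sbe[byte] >> bit) & 0x3;
}
```

sbe_bytes(13) is 4, which matches the array size above: 13 sources need
26 bits, and the last source (index 12) lands in the low bits of the
fourth byte.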
The ESB region mapping is handled at the machine level as it depends
on the chip id.
The 'ivt_offset' is only used to forward the event notification to
the routing engine:
+static void pnv_psi_notify(PnvPsi *psi, uint32_t lisn)
+{
+    uint64_t notif_port =
+        psi->regs[PSIHB_REG(PSIHB9_ESB_NOTIF_ADDR)];
+    bool valid = notif_port & PSIHB9_ESB_NOTIF_VALID;
+    uint64_t notify_addr = notif_port & ~PSIHB9_ESB_NOTIF_VALID;
+    uint32_t data = cpu_to_be32(psi->ivt_offset | lisn);
+
+    if (valid) {
+        cpu_physical_memory_write(notify_addr, &data, sizeof(data));
+    }
+}
So it really depends on the controller type. I think the notification
could be a class handler.
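A rough standalone sketch of what I mean by a handler per controller
type. This is plain C with a function pointer, not QOM code, and every
name in it (XiveSourceSketch, psi_like_notify, spapr_like_notify,
source_notify) is invented for the illustration. The PSIHB-like variant
only records the word it would forward; the byte swap and the write to
the notification port are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical source object; not a QEMU type. */
typedef struct XiveSourceSketch XiveSourceSketch;

/* The per-controller notification hook. */
typedef void (*xive_notify_fn)(XiveSourceSketch *src, uint32_t lisn);

struct XiveSourceSketch {
    xive_notify_fn notify;  /* set once per controller type */
    uint32_t ivt_offset;    /* only meaningful for the PSIHB-like case */
    uint32_t last_data;     /* what was forwarded, kept for inspection */
    unsigned notify_count;
};

/* PSIHB-like controller: the forwarded word is ivt_offset | lisn. */
static void psi_like_notify(XiveSourceSketch *src, uint32_t lisn)
{
    src->last_data = src->ivt_offset | lisn;
    src->notify_count++;
}

/* sPAPR-like controller: the LISN is handed over unchanged. */
static void spapr_like_notify(XiveSourceSketch *src, uint32_t lisn)
{
    src->last_data = lisn;
    src->notify_count++;
}

/* Common code: the source model calls the hook without knowing which
 * controller it belongs to. */
static void source_notify(XiveSourceSketch *src, uint32_t lisn)
{
    assert(src->notify != NULL);
    src->notify(src, lisn);
}
```

The common ESB code would then call source_notify() after a trigger or
an EOI, and each controller would provide its own way of reaching the
routing engine.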
Thanks,
C.
> Apart from that this looks pretty sound.
>
>> In the real world, each source may have different characteristics
>> depending on the revision of a controller or the CPU. Early systems
>> had two different MMIO pages for trigger and for EOI. We choose to use
>> the same characteristics for all sources to simplify the model. The
>> minimum CPU level for XIVE exploitation mode will be DD2.X as it has
>> full support.
>>
>> The OS will obtain the address of the MMIO page of the ESB entry
>> associated with a source and its characteristic using the
>> H_INT_GET_SOURCE_INFO hcall. This will be addressed in the patch
>> introducing the hcalls.
>>
>> The spapr_xive_irq() routine in charge of triggering the CPU interrupt
>> line will be filled later on.
>>
>> Signed-off-by: Cédric Le Goater <address@hidden>
>> ---
>>
>> Changes since v1:
>>
>> - merged in the same patch the qemu_irq handlers
>> - reworked the event notification logic of the qemu_irq handlers.
>> - introduced XIVE_ESB_STORE_EOI support
>> - removed 'esb_shift' field
>> - removed a useless check on the validity of the IVE in the memory
>> region handlers.
>> - fixed spapr_xive_pq_trigger() to return true when XIVE_ESB_QUEUED
>> is set
>> - removed the overall ESB memory region. We now have only one region
>> for the provisioned sources.
>> - improved 'info pic' output
>>
>> hw/intc/spapr_xive.c | 254
>> +++++++++++++++++++++++++++++++++++++++++++-
>> hw/intc/xive-internal.h | 10 ++
>> include/hw/ppc/spapr_xive.h | 9 ++
>> 3 files changed, 271 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
>> index e6e8841add17..43df6814619d 100644
>> --- a/hw/intc/spapr_xive.c
>> +++ b/hw/intc/spapr_xive.c
>> @@ -18,23 +18,252 @@
>>
>> #include "xive-internal.h"
>>
>> +static void spapr_xive_irq(sPAPRXive *xive, int lisn)
>> +{
>> +
>> +}
>> +
>> /*
>> - * Main XIVE object
>> + * XIVE Interrupt Source
>> + */
>> +
>> +/*
>> + * "magic" Event State Buffer (ESB) MMIO offsets.
>> + *
>> + * Each interrupt source has a 2-bit state machine called ESB
>> + * which can be controlled by MMIO. It's made of 2 bits, P and
>> + * Q. P indicates that an interrupt is pending (has been sent
>> + * to a queue and is waiting for an EOI). Q indicates that the
>> + * interrupt has been triggered while pending.
>> + *
>> + * This acts as a coalescing mechanism in order to guarantee
>> + * that a given interrupt only occurs at most once in a queue.
>> + *
>> + * When doing an EOI, the Q bit will indicate if the interrupt
>> + * needs to be re-triggered.
>> + *
>> + * The following offsets into the ESB MMIO allow to read or
>> + * manipulate the PQ bits. They must be used with an 8-bytes
>> + * load instruction. They all return the previous state of the
>> + * interrupt (atomically).
>> + *
>> + * Additionally, some ESB pages support doing an EOI via a
>> + * store at 0 and some ESBs support doing a trigger via a
>> + * separate trigger page.
>> + */
>> +#define XIVE_ESB_STORE_EOI 0x400 /* Store */
>> +#define XIVE_ESB_LOAD_EOI 0x000 /* Load */
>> +#define XIVE_ESB_GET 0x800 /* Load */
>> +#define XIVE_ESB_SET_PQ_00 0xc00 /* Load */
>> +#define XIVE_ESB_SET_PQ_01 0xd00 /* Load */
>> +#define XIVE_ESB_SET_PQ_10 0xe00 /* Load */
>> +#define XIVE_ESB_SET_PQ_11 0xf00 /* Load */
>> +
>> +#define XIVE_ESB_VAL_P 0x2
>> +#define XIVE_ESB_VAL_Q 0x1
>> +
>> +#define XIVE_ESB_RESET 0x0
>> +#define XIVE_ESB_PENDING XIVE_ESB_VAL_P
>> +#define XIVE_ESB_QUEUED (XIVE_ESB_VAL_P | XIVE_ESB_VAL_Q)
>> +#define XIVE_ESB_OFF XIVE_ESB_VAL_Q
>> +
>> +static uint8_t spapr_xive_pq_get(sPAPRXive *xive, uint32_t lisn)
>> +{
>> + uint32_t byte = lisn / 4;
>> + uint32_t bit = (lisn % 4) * 2;
>> +
>> + assert(byte < xive->sbe_size);
>> +
>> + return (xive->sbe[byte] >> bit) & 0x3;
>> +}
>> +
>> +static uint8_t spapr_xive_pq_set(sPAPRXive *xive, uint32_t lisn, uint8_t pq)
>> +{
>> + uint32_t byte = lisn / 4;
>> + uint32_t bit = (lisn % 4) * 2;
>> + uint8_t old, new;
>> +
>> + assert(byte < xive->sbe_size);
>> +
>> + old = xive->sbe[byte];
>> +
>> + new = xive->sbe[byte] & ~(0x3 << bit);
>> + new |= (pq & 0x3) << bit;
>> +
>> + xive->sbe[byte] = new;
>> +
>> + return (old >> bit) & 0x3;
>> +}
>> +
>> +static bool spapr_xive_pq_eoi(sPAPRXive *xive, uint32_t lisn)
>> +{
>> + uint8_t old_pq = spapr_xive_pq_get(xive, lisn);
>> +
>> + switch (old_pq) {
>> + case XIVE_ESB_RESET:
>> + spapr_xive_pq_set(xive, lisn, XIVE_ESB_RESET);
>> + return false;
>> + case XIVE_ESB_PENDING:
>> + spapr_xive_pq_set(xive, lisn, XIVE_ESB_RESET);
>> + return false;
>> + case XIVE_ESB_QUEUED:
>> + spapr_xive_pq_set(xive, lisn, XIVE_ESB_PENDING);
>> + return true;
>> + case XIVE_ESB_OFF:
>> + spapr_xive_pq_set(xive, lisn, XIVE_ESB_OFF);
>> + return false;
>> + default:
>> + g_assert_not_reached();
>> + }
>> +}
>> +
>> +/*
>> + * Returns whether the event notification should be forwarded to the
>> + * IVE for routing.
>> */
>> +static bool spapr_xive_pq_trigger(sPAPRXive *xive, uint32_t lisn)
>> +{
>> + uint8_t old_pq = spapr_xive_pq_get(xive, lisn);
>>
>> + switch (old_pq) {
>> + case XIVE_ESB_RESET:
>> + spapr_xive_pq_set(xive, lisn, XIVE_ESB_PENDING);
>> + return true;
>> + case XIVE_ESB_PENDING:
>> + spapr_xive_pq_set(xive, lisn, XIVE_ESB_QUEUED);
>> + return false;
>> + case XIVE_ESB_QUEUED:
>> + spapr_xive_pq_set(xive, lisn, XIVE_ESB_QUEUED);
>> + return false;
>> + case XIVE_ESB_OFF:
>> + spapr_xive_pq_set(xive, lisn, XIVE_ESB_OFF);
>> + return false;
>> + default:
>> + g_assert_not_reached();
>> + }
>> +}
>> +
>> +/*
>> + * XIVE Interrupt Source MMIOs
>> + */
>> +
>> +/*
>> + * Some HW use a separate page for trigger. We only support the case
>> + * in which the trigger can be done in the same page as the EOI.
>> + */
>> +static uint64_t spapr_xive_esb_read(void *opaque, hwaddr addr, unsigned
>> size)
>> +{
>> + sPAPRXive *xive = SPAPR_XIVE(opaque);
>> + uint32_t offset = addr & 0xF00;
>> + uint32_t lisn = addr >> ESB_SHIFT;
>> + uint64_t ret = -1;
>> +
>> + switch (offset) {
>> + case XIVE_ESB_LOAD_EOI:
>> + /*
>> + * EOI on load is not used anymore as we now advertise
>> + * XIVE_ESB_STORE_EOI support for the interrupt sources
>> + */
>> + ret = spapr_xive_pq_eoi(xive, lisn);
>> + break;
>> +
>> + case XIVE_ESB_GET:
>> + ret = spapr_xive_pq_get(xive, lisn);
>> + break;
>> +
>> + case XIVE_ESB_SET_PQ_00:
>> + case XIVE_ESB_SET_PQ_01:
>> + case XIVE_ESB_SET_PQ_10:
>> + case XIVE_ESB_SET_PQ_11:
>> + ret = spapr_xive_pq_set(xive, lisn, (offset >> 8) & 0x3);
>> + break;
>> + default:
>> + qemu_log_mask(LOG_GUEST_ERROR, "XIVE: invalid ESB addr %d\n",
>> offset);
>> + }
>> +
>> + return ret;
>> +}
>> +
>> +static void spapr_xive_esb_write(void *opaque, hwaddr addr,
>> + uint64_t value, unsigned size)
>> +{
>> + sPAPRXive *xive = SPAPR_XIVE(opaque);
>> + uint32_t offset = addr & 0xF00;
>> + uint32_t lisn = addr >> ESB_SHIFT;
>> + bool notify = false;
>> +
>> + switch (offset) {
>> + case 0:
>> + notify = spapr_xive_pq_trigger(xive, lisn);
>> + break;
>> + case XIVE_ESB_STORE_EOI:
>> + /* If the Q bit is set, we should forward a new source event
>> + * notification
>> + */
>> + notify = spapr_xive_pq_eoi(xive, lisn);
>> + break;
>> + default:
>> + qemu_log_mask(LOG_GUEST_ERROR, "XIVE: invalid ESB write addr %d\n",
>> + offset);
>> + return;
>> + }
>> +
>> + /* Forward the source event notification for routing */
>> + if (notify) {
>> + spapr_xive_irq(xive, lisn);
>> + }
>> +}
>> +
>> +static const MemoryRegionOps spapr_xive_esb_ops = {
>> + .read = spapr_xive_esb_read,
>> + .write = spapr_xive_esb_write,
>> + .endianness = DEVICE_BIG_ENDIAN,
>> + .valid = {
>> + .min_access_size = 8,
>> + .max_access_size = 8,
>> + },
>> + .impl = {
>> + .min_access_size = 8,
>> + .max_access_size = 8,
>> + },
>> +};
>> +
>> +static void spapr_xive_source_set_irq(void *opaque, int lisn, int val)
>> +{
>> + sPAPRXive *xive = SPAPR_XIVE(opaque);
>> + bool notify = false;
>> +
>> + if (val) {
>> + notify = spapr_xive_pq_trigger(xive, lisn);
>> + }
>> +
>> + /* Forward the source event notification for routing */
>> + if (notify) {
>> + spapr_xive_irq(xive, lisn);
>> + }
>> +}
>> +
>> +/*
>> + * Main XIVE object
>> + */
>> void spapr_xive_pic_print_info(sPAPRXive *xive, Monitor *mon)
>> {
>> int i;
>>
>> for (i = 0; i < xive->nr_irqs; i++) {
>> XiveIVE *ive = &xive->ivt[i];
>> + uint8_t pq;
>>
>> if (!(ive->w & IVE_VALID)) {
>> continue;
>> }
>>
>> - monitor_printf(mon, " %4x %s %08x %08x\n", i,
>> + pq = spapr_xive_pq_get(xive, i);
>> +
>> + monitor_printf(mon, " %4x %s %c%c %08x %08x\n", i,
>> ive->w & IVE_MASKED ? "M" : " ",
>> + pq & XIVE_ESB_VAL_P ? 'P' : '-',
>> + pq & XIVE_ESB_VAL_Q ? 'Q' : '-',
>> (int) GETFIELD(IVE_EQ_INDEX, ive->w),
>> (int) GETFIELD(IVE_EQ_DATA, ive->w));
>> }
>> @@ -52,6 +281,9 @@ static void spapr_xive_reset(DeviceState *dev)
>> ive->w |= IVE_MASKED;
>> }
>> }
>> +
>> + /* SBEs are initialized to 0b01 which corresponds to "ints off" */
>> + memset(xive->sbe, 0x55, xive->sbe_size);
>> }
>>
>> static void spapr_xive_realize(DeviceState *dev, Error **errp)
>> @@ -65,6 +297,23 @@ static void spapr_xive_realize(DeviceState *dev, Error
>> **errp)
>>
>> /* Allocate the IVT (Interrupt Virtualization Table) */
>> xive->ivt = g_new0(XiveIVE, xive->nr_irqs);
>> +
>> + /* QEMU IRQs */
>> + xive->qirqs = qemu_allocate_irqs(spapr_xive_source_set_irq, xive,
>> + xive->nr_irqs);
>> +
>> + /* Allocate SBEs (State Bit Entry). 2 bits, so 4 entries per byte */
>> + xive->sbe_size = DIV_ROUND_UP(xive->nr_irqs, 4);
>> + xive->sbe = g_malloc0(xive->sbe_size);
>> +
>> + /* VC BAR. Use address of chip 0 to install the ESB memory region
>> + * for *all* interrupt sources */
>> + xive->esb_base = (P9_MMIO_BASE | VC_BAR_DEFAULT);
>> +
>> + memory_region_init_io(&xive->esb_iomem, OBJECT(xive),
>> + &spapr_xive_esb_ops, xive, "xive.esb",
>> + (1ull << ESB_SHIFT) * xive->nr_irqs);
>> + sysbus_init_mmio(SYS_BUS_DEVICE(dev), &xive->esb_iomem);
>> }
>>
>> static const VMStateDescription vmstate_spapr_xive_ive = {
>> @@ -92,6 +341,7 @@ static const VMStateDescription vmstate_spapr_xive = {
>> VMSTATE_UINT32_EQUAL(nr_irqs, sPAPRXive, NULL),
>> VMSTATE_STRUCT_VARRAY_UINT32(ivt, sPAPRXive, nr_irqs, 1,
>> vmstate_spapr_xive_ive, XiveIVE),
>> + VMSTATE_VBUFFER_UINT32(sbe, sPAPRXive, 1, NULL, sbe_size),
>> VMSTATE_END_OF_LIST()
>> },
>> };
>> diff --git a/hw/intc/xive-internal.h b/hw/intc/xive-internal.h
>> index 132b71a6daf0..872648dd96a2 100644
>> --- a/hw/intc/xive-internal.h
>> +++ b/hw/intc/xive-internal.h
>> @@ -16,6 +16,16 @@
>> #define SETFIELD(m, v, val) \
>> (((v) & ~(m)) | ((((typeof(v))(val)) << MASK_TO_LSH(m)) & (m)))
>>
>> +/*
>> + * XIVE MMIO regions
>> + */
>> +#define P9_MMIO_BASE 0x006000000000000ull
>> +
>> +/* VC BAR contains set translations for the ESBs and the EQs. */
>> +#define VC_BAR_DEFAULT 0x10000000000ull
>> +#define VC_BAR_SIZE 0x08000000000ull
>> +#define ESB_SHIFT 16 /* One 64k page. OPAL has two */
>> +
>> /* IVE/EAS
>> *
>> * One per interrupt source. Targets that interrupt to a given EQ
>> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
>> index 5b1f78e06a1e..ecc15d889b74 100644
>> --- a/include/hw/ppc/spapr_xive.h
>> +++ b/include/hw/ppc/spapr_xive.h
>> @@ -24,8 +24,17 @@ struct sPAPRXive {
>> /* Properties */
>> uint32_t nr_irqs;
>>
>> + /* IRQ */
>> + qemu_irq *qirqs;
>> +
>> /* XIVE internal tables */
>> XiveIVE *ivt;
>> + uint8_t *sbe;
>> + uint32_t sbe_size;
>> +
>> + /* ESB memory region */
>> + hwaddr esb_base;
>> + MemoryRegion esb_iomem;
>> };
>>
>> bool spapr_xive_irq_enable(sPAPRXive *xive, uint32_t lisn);
>
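For reference, the PQ transitions implemented by spapr_xive_pq_trigger()
and spapr_xive_pq_eoi() in the patch above can be reduced to this
standalone sketch. It works on a single 2-bit value instead of the
packed 'sbe' array, and the names are shortened for the example. It is
meant as a summary of the quoted code, not as a replacement for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same encoding as the XIVE_ESB_* values in the patch. */
#define ESB_VAL_P   0x2
#define ESB_VAL_Q   0x1

#define ESB_RESET   0x0
#define ESB_OFF     ESB_VAL_Q
#define ESB_PENDING ESB_VAL_P
#define ESB_QUEUED  (ESB_VAL_P | ESB_VAL_Q)

/* Trigger: update *pq and return true when the event notification
 * should be forwarded for routing. */
static bool pq_trigger(uint8_t *pq)
{
    switch (*pq & 0x3) {
    case ESB_RESET:
        *pq = ESB_PENDING;   /* first event: forward it */
        return true;
    case ESB_PENDING:
    case ESB_QUEUED:
        *pq = ESB_QUEUED;    /* coalesce while waiting for an EOI */
        return false;
    default:                 /* ESB_OFF: source disabled, no change */
        return false;
    }
}

/* EOI: update *pq and return true when a queued event must be
 * re-triggered. */
static bool pq_eoi(uint8_t *pq)
{
    switch (*pq & 0x3) {
    case ESB_QUEUED:
        *pq = ESB_PENDING;   /* Q was set: forward a new notification */
        return true;
    case ESB_PENDING:
        *pq = ESB_RESET;
        return false;
    default:                 /* ESB_RESET and ESB_OFF: no change */
        return false;
    }
}
```

Starting from ESB_RESET, two triggers followed by two EOIs walk through
PENDING, QUEUED, PENDING and back to RESET, with a notification
forwarded on the first trigger and on the first EOI. A source left in
ESB_OFF ignores triggers.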