Cédric Le Goater
[Qemu-devel] [PATCH v2 00/19] spapr: Guest exploitation of the XIVE interrupt controller (POWER9)
Sat, 9 Dec 2017 09:43:19 +0100
On a POWER9 sPAPR machine, the Client Architecture Support (CAS)
negotiation process determines whether the guest operates with an
interrupt controller using the XICS legacy model, as found on POWER8,
or in XIVE exploitation mode, the newer POWER9 interrupt model. XIVE
is a complex interrupt controller introducing a large number of new
features, for virtualization in particular.
It is composed of three sub-engines:
- Interrupt Virtualization Source Engine (IVSE). These are in PHBs,
in the main controller for the IPIs and in the PSI host
bridge. They are configured to feed the IVRE with events.
- Interrupt Virtualization Routing Engine (IVRE). Its job is to
match an event source with a Notification Virtualization Target
(NVT), a priority and an Event Queue (EQ) to determine if a
Virtual Processor can handle the event.
- Interrupt Virtualization Presentation Engine (IVPE). It maintains
the interrupt state of each hardware thread and presents the
notification as an external exception.
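As a rough illustration of the presenter's role, here is a minimal C
sketch of how a pending priority could be turned into a notification.
It is not the code of the series: the ThreadCtx type, its field names
and the helper names are hypothetical, and the IPB and CPPR registers
are assumptions borrowed from the usual XIVE thread context layout.
Only the initial PIPR value of 0xFF comes from the changelog below.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-thread presenter state; names are illustrative. */
typedef struct {
    uint8_t ipb;   /* pending buffer: one bit per priority 0..7 (assumed) */
    uint8_t pipr;  /* most favoured pending priority, 0xFF when none */
    uint8_t cppr;  /* current processor priority (assumed) */
} ThreadCtx;

/* Priority 0 is the most favoured and maps to the top bit of the IPB. */
static uint8_t priority_to_ipb(uint8_t prio)
{
    return prio > 7 ? 0 : (uint8_t)(1u << (7 - prio));
}

/* Return the most favoured priority set in the IPB, or 0xFF if empty. */
static uint8_t ipb_to_pipr(uint8_t ipb)
{
    for (uint8_t prio = 0; prio < 8; prio++) {
        if (ipb & priority_to_ipb(prio)) {
            return prio;
        }
    }
    return 0xFF;
}

/*
 * Record a pending event of the given priority. Returns true when the
 * pending priority is more favoured than the current one, i.e. when an
 * external exception should be presented to the thread.
 */
static bool thread_ctx_notify(ThreadCtx *t, uint8_t prio)
{
    t->ipb |= priority_to_ipb(prio);
    t->pipr = ipb_to_pipr(t->ipb);
    return t->pipr < t->cppr;
}
```

With a CPPR of 4, an event of priority 6 is only recorded, while an
event of priority 2 causes thread_ctx_notify() to return true.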
Each of the engines uses a set of internal tables to redirect
exceptions from event sources to CPU threads. Interrupt sources have a
2-bit state machine, the Event State Buffer (ESB), that allows events
to be triggered. If the event is let through, the IVRE looks up the
Interrupt Virtualization Entry (IVE) table to find the Event Queue
Descriptor configured for the source. Each Event Queue Descriptor
defines a notification path to a CPU and an in-memory queue in which
an event identifier is recorded for the OS to pull.
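The path described above can be sketched in a few lines of C. This is
a simplified model under stated assumptions, not the implementation of
the series: the type and function names (Ive, EventQueue,
route_event(), ...) are hypothetical, the ESB state names and
encodings are assumed for illustration, and the queue is written in
host byte order whereas the real queue lives in guest memory.

```c
#include <stdbool.h>
#include <stdint.h>

/* 2-bit ESB states; names and encodings are assumed for illustration. */
enum { ESB_RESET = 0x0, ESB_OFF = 0x1, ESB_PENDING = 0x2, ESB_QUEUED = 0x3 };

/* Trigger a source. Returns true if the event is let through. */
static bool esb_trigger(uint8_t *pq)
{
    switch (*pq) {
    case ESB_RESET:
        *pq = ESB_PENDING;
        return true;
    case ESB_PENDING:
        *pq = ESB_QUEUED;   /* coalesce until the next EOI */
        return false;
    default:
        return false;       /* QUEUED stays queued, OFF masks the source */
    }
}

/* EOI a source. Returns true if a coalesced event must be replayed. */
static bool esb_eoi(uint8_t *pq)
{
    switch (*pq) {
    case ESB_QUEUED:
        *pq = ESB_PENDING;
        return true;
    case ESB_PENDING:
        *pq = ESB_RESET;
        return false;
    default:
        return false;
    }
}

/* Hypothetical Event Queue Descriptor: a ring of 32-bit entries. */
typedef struct {
    uint32_t *ring;     /* stands in for the in-memory OS queue */
    uint32_t entries;   /* number of slots in the ring */
    uint32_t index;     /* next slot to write */
    uint32_t gen;       /* generation bit, flipped on each wrap */
} EventQueue;

/* Record an event identifier, tagged with the generation bit. */
static void eq_push(EventQueue *eq, uint32_t data)
{
    eq->ring[eq->index] = (eq->gen << 31) | (data & 0x7fffffff);
    eq->index = (eq->index + 1) % eq->entries;
    if (eq->index == 0) {
        eq->gen ^= 1;
    }
}

/* Hypothetical IVE: which queue a source targets, and with which data. */
typedef struct {
    bool valid;
    bool masked;
    uint32_t eq_idx;
    uint32_t eq_data;
} Ive;

/* Source trigger -> IVE lookup -> EQ push. Returns true if queued. */
static bool route_event(uint8_t *pq, const Ive *ive,
                        EventQueue *eqs, uint32_t nr_eqs)
{
    if (!esb_trigger(pq)) {
        return false;
    }
    if (!ive->valid || ive->masked || ive->eq_idx >= nr_eqs) {
        return false;
    }
    eq_push(&eqs[ive->eq_idx], ive->eq_data);
    return true;
}
```

The generation bit lets the OS tell new entries from stale ones
without a shared producer index: an entry is valid only while its top
bit matches the generation the consumer currently expects.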
The high level ideas of the current design are :
- introduce a persistent XIVE object under the sPAPR machine for
newer machines and let the CAS negotiation process decide whether
it should be used or not. Use the 'ov5_cas' attribute for this
- introduce a persistent XIVE interrupt presenter under the sPAPR
core and switch ICP after CAS. Each core now has two ICPs, one
active through the 'intc' pointer and another one among its
children ready to be used if the guest requires it.
- move the XIVE EQs under the cores to simplify the XIVE model
- allocate the CPU IPIs at the beginning of the IRQ number space to
be compatible with XICS (which starts at 4096) and also to simplify
the model. This means that the XIVE model covers the whole IRQ
number space. There is no offset, like in XICS, splitting the IRQ
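The idea of keeping two presenters per core and selecting one after
CAS can be sketched as follows. This is only an illustration: the
Presenter and Core types and the core_select_intc() helper are
hypothetical, and only the 'intc' pointer name comes from the
description above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an interrupt presenter object. */
typedef struct {
    const char *name;
} Presenter;

/* Hypothetical core: both presenters exist, 'intc' points at one. */
typedef struct {
    Presenter xics_icp;   /* legacy XICS presenter */
    Presenter xive_nvt;   /* XIVE presenter, ready if the guest asks */
    Presenter *intc;      /* the active presenter */
} Core;

/* Both presenters are created up front; XICS is active by default. */
static void core_init(Core *core)
{
    core->xics_icp.name = "xics";
    core->xive_nvt.name = "xive";
    core->intc = &core->xics_icp;
}

/* Called once CAS has determined the interrupt mode of the guest. */
static void core_select_intc(Core *core, bool xive_negotiated)
{
    core->intc = xive_negotiated ? &core->xive_nvt : &core->xics_icp;
}
```

Since both objects are persistent, switching modes is a pointer
update and neither presenter has to be created or destroyed when the
guest renegotiates.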
The patchset first introduces new models for XIVE :
- sPAPRXive holding the internal tables and the MMIO regions used by
the XIVE controller.
- sPAPRXiveNVT object storing the interrupt state of the CPU and
acting as the XIVE interrupt presenter
then, describes the notification process and the interrupt delivery to
It finishes with the integration of sPAPRXive object under the sPAPR
machine, the introduction of the new XIVE hcalls, the device tree
layout, and the necessary adjustments to support the CAS negotiation.
Migration, CPU hotplug, and support for older machines and QEMU
versions are also addressed. KVM support is not addressed yet and the
guest needs to be run with kernel_irqchip=off on a POWER9 system.
Code is here:
Changes since v1 :
- used g_new0 instead of g_malloc0
- removed VMSTATE_STRUCT_VARRAY_UINT32_ALLOC
- introduced a device reset handler. The object needs to be parented
to sysbus when created.
- renamed spapr_xive_irq_set() to spapr_xive_irq_enable()
- renamed spapr_xive_irq_unset() to spapr_xive_irq_disable()
- moved the PPC_BIT macros under target/ppc/cpu.h
- shrunk the file copyright header
- reworked the event notification logic of the qemu_irq handlers.
- introduced XIVE_ESB_STORE_EOI support
- removed 'esb_shift' field
- removed a useless check on the validity of the IVE in the memory
- removed the overall ESB memory region. We now have only one region
for the provisioned sources.
- improved 'info pic' output
- improved LSI support
- renamed 'sPAPRXiveICP' to 'sPAPRXiveNVT'
- renamed 'tima' field to 'regs'
- renamed 'tima_os' field to 'ring_os'
- removed 'tm_shift' field
- introduced a memory region to model the User TIMA and another one
for the OS TIMA. One page size for each.
- removed useless checks in the memory region handlers
- removed support for 970 ...
- removed spapr_xive_eq_for_server() which did the EQ indexing.
- changed spapr_xive_get_eq() to use a server and a priority parameter
- introduced a couple of macros for the EQ indexing.
- replaced dma_memory_write() by stl_be_dma()
- set initial TM_PIPR to 0xFF in sPAPRXiveNVT
- conditioned the creation of the sPAPRXive object on the
xive_exploitation bool, which is false on older pseries machines.
- parented the sPAPRXive object to sysbus.
- simplified priority_is_valid() routine (to its minimum)
- used PPC_BIT() macros to define the hcall flags
- removed useless casts
- defined the default characteristic of the single XIVE interrupt
source to be : *XIVE_SRC_TRIGGER | XIVE_SRC_STORE_EOI*
- removed EQ_W0_UCOND_NOTIFY when the EQ is reset
- fixed XIVE_EQ_DEBUG support. Offset for the generation bit was wrong
- added a unit id to the nodename
- added properties for the LSIs
- simplified the array for the "ibm,plat-res-int-priorities" property
- renamed spapr_xive_populate() to spapr_dt_xive()
- moved the mapping of the XIVE memory region and the setting
of the ICP under the machine reset handler.
- introduced a spapr_xive_qirq() helper
- introduced a spapr_xive_nvt_create() helper
- handled more errors in spapr_post_load() to return EINVAL
Cédric Le Goater (19):
dma-helpers: add a return value to store helpers
spapr: introduce a skeleton for the XIVE interrupt controller
spapr: introduce the XIVE interrupt sources
spapr: add support for the LSI interrupt sources
spapr: introduce a XIVE interrupt presenter model
spapr: introduce the XIVE Event Queues
spapr: push the XIVE EQ data in OS event queue
spapr: notify the CPU when the XIVE interrupt priority is more
spapr: add support for the SET_OS_PENDING command (XIVE)
spapr: introduce a 'xive_exploitation' boolean to enable XIVE
spapr: add a sPAPRXive object to the machine
spapr: add hcalls support for the XIVE exploitation interrupt mode
spapr: add device tree support for the XIVE interrupt mode
spapr: introduce a helper to map the XIVE memory regions
spapr: add XIVE support to spapr_qirq()
spapr: introduce a spapr_icp_create() helper
spapr: toggle the ICP depending on the selected interrupt mode
spapr: add support to dump XIVE information
spapr: advertise XIVE exploitation mode in CAS
default-configs/ppc64-softmmu.mak | 1 +
hw/intc/Makefile.objs | 1 +
hw/intc/spapr_xive.c | 1013 +++++++++++++++++++++++++++++++++++++
hw/intc/spapr_xive_hcall.c | 923 +++++++++++++++++++++++++++++++++
hw/intc/xive-internal.h | 196 +++++++
hw/ppc/spapr.c | 188 ++++++-
hw/ppc/spapr_cpu_core.c | 37 +-
hw/ppc/spapr_hcall.c | 6 +
include/hw/ppc/spapr.h | 20 +-
include/hw/ppc/spapr_cpu_core.h | 1 +
include/hw/ppc/spapr_xive.h | 72 +++
include/sysemu/dma.h | 4 +-
12 files changed, 2449 insertions(+), 13 deletions(-)
create mode 100644 hw/intc/spapr_xive.c
create mode 100644 hw/intc/spapr_xive_hcall.c
create mode 100644 hw/intc/xive-internal.h
create mode 100644 include/hw/ppc/spapr_xive.h