qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and


From: Greg Kurz
Subject: [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids
Date: Fri, 20 Nov 2020 18:46:38 +0100

A regression was recently fixed in the sPAPR XIVE code for QEMU 5.2
RC3 [1]. It boiled down to a confusion between IPI numbers and vCPU
ids, which happen to be numerically equal in general, but are really
different entities that can diverge in some setups. When this happens,
we end up misconfiguring XIVE in a way that is almost fatal for the
guest.

The confusion comes from XICS which has historically assumed equality
between interrupt server numbers and vCPU ids, as mentionned in a
comment back from 2011 in the linux kernel icp_native_init_one_node()
function:

    /* This code does the theorically broken assumption that the interrupt
     * server numbers are the same as the hard CPU numbers.
     * This happens to be the case so far but we are playing with fire...
     * should be fixed one of these days. -BenH.
     */

This assumption crept into QEMU through the "ibm,interrupt-server-ranges"
property of the "interrupt-controller" node in the DT. This property
contains ranges of consecutive vCPU ids defined as (first id, # of ids).
In the case of QEMU, we define a single range starting from 0 up to the
highest vCPU id, as returned by spapr_max_server_number(). This has
always been associated to the "nr_servers" wording when naming variables
or function arguments. When XIVE got added, we introduced an sPAPR IRQ
abstraction to be able to control several interrupt controller backends.
The sPAPR IRQ base class provides a dt() handler used to populate the
"interrupt-controller" node in the DT. This handler takes an "nr_server"
argument inherited from XICS and we ended up using it to populate the
"ibm,xive-lisn-ranges" property used with XIVE, which has completely
different semantics. It contain ranges of interrupt numbers that the
guest can use for any purpose. Since one obvious purpose is IPI, its
first range should just be able to accomodate all possible vCPUs.
In the case of QEMU, we define a single range starting from 0 up
to "nr_server" but we should rather size it to the number of vCPUs
actually (ie. smp.max_cpus).

This series aims at getting rid of the "nr_server" argument in the
sPAPR IC handlers. Since both the highest possible vCPU id and the
maximum number of vCPUs are invariants for XICS and XIVE respectively,
let's make them device properties to be configured by the machine when
it creates the interrupt controllers and use them where needed.

This doesn't cause any visible change to setups using the default
VSMT machine settings. This changes "ibm,xive-lisn-ranges" for
setups that mess with VSMT, but this is acceptable since linux
only allocates one interrupt per vCPU and the higher part of the
range was never used.

[1] 
https://git.qemu.org/?p=qemu.git;a=commit;h=6d24795ee7e3199d199d3c415312c93382ad1807

Greg Kurz (8):
  spapr/xive: Turn some sanity checks into assertions
  spapr/xive: Introduce spapr_xive_nr_ends()
  spapr/xive: Add "nr-servers" property
  spapr/xive: Add "nr-ipis" property
  spapr/xics: Drop unused argument to xics_kvm_has_broken_disconnect()
  spapr/xics: Add "nr-servers" property
  spapr: Drop "nr_servers" argument of the sPAPR IC activate() operation
  spapr: Drop "nr_servers" argument of the sPAPR IC dt() operation

 include/hw/ppc/spapr.h      |  4 +--
 include/hw/ppc/spapr_irq.h  |  9 ++---
 include/hw/ppc/spapr_xive.h | 25 ++++++++++++-
 include/hw/ppc/xics_spapr.h | 23 +++++++++---
 hw/intc/spapr_xive.c        | 72 ++++++++++++++++++++++---------------
 hw/intc/spapr_xive_kvm.c    |  4 +--
 hw/intc/xics_kvm.c          |  4 +--
 hw/intc/xics_spapr.c        | 45 ++++++++++++++---------
 hw/ppc/spapr.c              |  7 ++--
 hw/ppc/spapr_irq.c          | 27 +++++++-------
 10 files changed, 141 insertions(+), 79 deletions(-)

-- 
2.26.2





reply via email to

[Prev in Thread] Current Thread [Next in Thread]