[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [RFC PATCH] spapr/xive: Allocate vCPU IPIs from local context
From: |
Greg Kurz |
Subject: |
Re: [RFC PATCH] spapr/xive: Allocate vCPU IPIs from local context |
Date: |
Fri, 24 Sep 2021 15:49:06 +0200 |
On Fri, 24 Sep 2021 14:40:24 +0200
Cédric Le Goater <clg@kaod.org> wrote:
> On 9/23/21 11:12, Greg Kurz wrote:
> > On Wed, 22 Sep 2021 16:41:20 +0200
> > Cédric Le Goater <clg@kaod.org> wrote:
> >
> >> When QEMU switches to the XIVE interrupt mode, it creates all possible
> >> guest interrupts at the level of the KVM device. These interrupts are
> >> backed by real HW interrupts from the IPI interrupt pool of the XIVE
> >> controller.
> >>
> >> Currently, this is done from the QEMU main thread, which results in
> >> allocating all interrupts from the chip on which QEMU is running. IPIs
> >> are not distributed across the system and the load is not well
> >> balanced across the interrupt controllers.
> >>
> >> To improve distribution on the system, we should try to allocate the
> >> underlying HW IPI from the vCPU context. This also benefits to
> >> configurations using CPU pinning. In this case, the HW IPI is
> >> allocated on the local chip, rerouting between interrupt controllers
> >> is reduced and performance improved.
> >>
> >> This moves the initialization of the vCPU IPIs from reset time to the
> >> H_INT_SET_SOURCE_CONFIG hcall which is called from the vCPU context.
> >> But this needs some extra checks in the sequences getting and setting
> >> the source states to make sure a valid HW IPI is backing the guest
> >> interrupt. For that, we check if a target was configured in the END in
> >> case of a vCPU IPI.
> >>
> >> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> >> ---
> >>
> >> I have tested different SMT configurations, kernel_irqchip=off/on,
> >> did some migrations, CPU hotplug, etc. It's not enough and I would
> >> like more testing but, at least, it is not making anymore the bad
> >> assumption vCPU id = IPI number.
> >>
> >
> > Yeah, the IPI number is provided by the guest, so h_int_set_source_config()
> > is really the only place where we can know the IPI number of a given vCPU.
>
> The patch lacks a run_on_cpu() to perform the reset on the vCPU context
> to be complete.
>
Yes since the vCPU doing the hcall is obviously not the target for the
IPI :-)
> >
> >> Comments ?
> >>
> >
> > LGTM but I didn't check if more users of xive_end_is_valid() should
> > be converted to using xive_source_is_initialized().
>
> I think you mean xive_eas_is_valid() ?
>
Oops yes... bad copy/paste :-\
> The changes only impact KVM support since we are deferring the IRQ
> initialization at the KVM level. What we have to be careful about is not
> accessing an ESB page of an interrupt that would not have been initiliazed
> in the KVM device, else QEMU gets a sigbus.
>
Ok, so this is just needed on a path that leads to xive_esb_rw() or
kvmppc_xive_esb_trigger(), right ?
It seems that
h_int_esb()
kvmppc_xive_esb_rw()
can get there with an lisn provided by the guest, and I don't see any
such check on the way : h_int_esb() only checks xive_eas_is_valid().
Cheers,
--
Greg
> That only happens when QEMU gets/sets the ESB states.
>
> > Any chance you have some perf numbers to share ?
>
> I will try.
>
> Thanks,
>
> C.
>
>
> >> hw/intc/spapr_xive.c | 17 +++++++++++++++++
> >> hw/intc/spapr_xive_kvm.c | 36 +++++++++++++++++++++++++++++++-----
> >> 2 files changed, 48 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> >> index 6f31ce74f198..2dc594a720b1 100644
> >> --- a/hw/intc/spapr_xive.c
> >> +++ b/hw/intc/spapr_xive.c
> >> @@ -1089,6 +1089,23 @@ static target_ulong
> >> h_int_set_source_config(PowerPCCPU *cpu,
> >> if (spapr_xive_in_kernel(xive)) {
> >> Error *local_err = NULL;
> >>
> >> + /*
> >> + * Initialize the vCPU IPIs from the vCPU context to allocate
> >> + * the backing HW IPI on the local chip. This improves
> >> + * distribution of the IPIs in the system and when the vCPUs
> >> + * are pinned, it reduces rerouting between interrupt
> >> + * controllers for better performance.
> >> + */
> >> + if (lisn < SPAPR_XIRQ_BASE) {
> >> + XiveSource *xsrc = &xive->source;
> >> +
> >> + kvmppc_xive_source_reset_one(xsrc, lisn, &local_err);
> >> + if (local_err) {
> >> + error_report_err(local_err);
> >> + return H_HARDWARE;
> >> + }
> >> + }
> >> +
> >> kvmppc_xive_set_source_config(xive, lisn, &new_eas, &local_err);
> >> if (local_err) {
> >> error_report_err(local_err);
> >> diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
> >> index 53731d158625..1607a59e9483 100644
> >> --- a/hw/intc/spapr_xive_kvm.c
> >> +++ b/hw/intc/spapr_xive_kvm.c
> >> @@ -254,7 +254,12 @@ static int kvmppc_xive_source_reset(XiveSource *xsrc,
> >> Error **errp)
> >> SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
> >> int i;
> >>
> >> - for (i = 0; i < xsrc->nr_irqs; i++) {
> >> + /*
> >> + * vCPU IPIs are initialized at the KVM level when configured by
> >> + * H_INT_SET_SOURCE_CONFIG.
> >> + */
> >> +
> >> + for (i = SPAPR_XIRQ_BASE; i < xsrc->nr_irqs; i++) {
> >> int ret;
> >>
> >> if (!xive_eas_is_valid(&xive->eat[i])) {
> >> @@ -342,6 +347,27 @@ uint64_t kvmppc_xive_esb_rw(XiveSource *xsrc, int
> >> srcno, uint32_t offset,
> >> }
> >> }
> >>
> >> +static bool xive_source_is_initialized(SpaprXive *xive, int lisn)
> >> +{
> >> + assert(lisn < xive->nr_irqs);
> >> +
> >> + if (!xive_eas_is_valid(&xive->eat[lisn])) {
> >> + return false;
> >> + }
> >> +
> >> + /*
> >> + * vCPU IPIs are initialized at the KVM level when configured by
> >> + * H_INT_SET_SOURCE_CONFIG, in which case, we should have a valid
> >> + * target (server, priority) in the END.
> >> + */
> >> + if (lisn < SPAPR_XIRQ_BASE) {
> >> + return !!xive_get_field64(EAS_END_INDEX, xive->eat[lisn].w);
> >> + }
> >> +
> >> + /* Device sources */
> >> + return true;
> >> +}
> >> +
> >> static void kvmppc_xive_source_get_state(XiveSource *xsrc)
> >> {
> >> SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
> >> @@ -350,7 +376,7 @@ static void kvmppc_xive_source_get_state(XiveSource
> >> *xsrc)
> >> for (i = 0; i < xsrc->nr_irqs; i++) {
> >> uint8_t pq;
> >>
> >> - if (!xive_eas_is_valid(&xive->eat[i])) {
> >> + if (!xive_source_is_initialized(xive, i)) {
> >> continue;
> >> }
> >>
> >> @@ -533,7 +559,7 @@ static void kvmppc_xive_change_state_handler(void
> >> *opaque, bool running,
> >> uint8_t pq;
> >> uint8_t old_pq;
> >>
> >> - if (!xive_eas_is_valid(&xive->eat[i])) {
> >> + if (!xive_source_is_initialized(xive, i)) {
> >> continue;
> >> }
> >>
> >> @@ -561,7 +587,7 @@ static void kvmppc_xive_change_state_handler(void
> >> *opaque, bool running,
> >> for (i = 0; i < xsrc->nr_irqs; i++) {
> >> uint8_t pq;
> >>
> >> - if (!xive_eas_is_valid(&xive->eat[i])) {
> >> + if (!xive_source_is_initialized(xive, i)) {
> >> continue;
> >> }
> >>
> >> @@ -666,7 +692,7 @@ int kvmppc_xive_post_load(SpaprXive *xive, int
> >> version_id)
> >>
> >> /* Restore the EAT */
> >> for (i = 0; i < xive->nr_irqs; i++) {
> >> - if (!xive_eas_is_valid(&xive->eat[i])) {
> >> + if (!xive_source_is_initialized(xive, i)) {
> >> continue;
> >> }
> >>
> >
>