[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v2 12/13] spapr: add KVM support to the 'dual' m
From: |
David Gibson |
Subject: |
Re: [Qemu-devel] [PATCH v2 12/13] spapr: add KVM support to the 'dual' machine |
Date: |
Wed, 13 Mar 2019 15:18:19 +1100 |
User-agent: |
Mutt/1.11.3 (2019-02-01) |
On Tue, Mar 12, 2019 at 10:40:19AM +0100, Cédric Le Goater wrote:
> On 2/28/19 6:15 AM, David Gibson wrote:
> > On Fri, Feb 22, 2019 at 02:13:21PM +0100, Cédric Le Goater wrote:
> >> The interrupt mode is chosen by the CAS negotiation process and
> >> activated after a reset to take into account the required changes in
> >> the machine. This brings new constraints on how the associated KVM IRQ
> >> device is initialized.
> >>
> >> Currently, each model takes care of the initialization of the KVM
> >> device in their realize method but this is not possible anymore as the
> >> initialization needs to be done globaly when the interrupt mode is
> >> known, i.e. when machine is reseted. It also means that we need a way
> >> to delete a KVM device when another mode is chosen.
> >>
> >> Also, to support migration, the QEMU objects holding the state to
> >> transfer should always be available but not necessarily activated.
> >>
> >> The overall approach of this proposal is to initialize both interrupt
> >> mode at the QEMU level and keep the IRQ number space in sync to allow
> >> switching from one mode to another. For the KVM side of things, the
> >> whole initialization of the KVM device, sources and presenters, is
> >> grouped in a single routine. The XICS and XIVE sPAPR IRQ reset
> >> handlers are modified accordingly to handle the init and the delete
> >> sequences of the KVM device.
> >>
> >> Signed-off-by: Cédric Le Goater <address@hidden>
> >> ---
> >> include/hw/ppc/spapr_xive.h | 1 +
> >> hw/intc/spapr_xive.c | 19 +++++++-
> >> hw/intc/spapr_xive_kvm.c | 27 +++++++++++
> >> hw/intc/xics_kvm.c | 26 ++++++++++
> >> hw/intc/xive.c | 4 --
> >> hw/ppc/spapr_irq.c | 97 ++++++++++++++++++++++++++++---------
> >> 6 files changed, 145 insertions(+), 29 deletions(-)
> >>
> >> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> >> index a7c4c275a747..a1593ac2fcf0 100644
> >> --- a/include/hw/ppc/spapr_xive.h
> >> +++ b/include/hw/ppc/spapr_xive.h
> >> @@ -66,6 +66,7 @@ void spapr_xive_map_mmio(sPAPRXive *xive);
> >>
> >> int spapr_xive_end_to_target(uint8_t end_blk, uint32_t end_idx,
> >> uint32_t *out_server, uint8_t *out_prio);
> >> +void spapr_xive_late_realize(sPAPRXive *xive, Error **errp);
> >>
> >> /*
> >> * KVM XIVE device helpers
> >> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> >> index 21fe5e1aa39f..b0cbc2fe21ee 100644
> >> --- a/hw/intc/spapr_xive.c
> >> +++ b/hw/intc/spapr_xive.c
> >> @@ -278,7 +278,6 @@ static void spapr_xive_realize(DeviceState *dev, Error
> >> **errp)
> >> XiveSource *xsrc = &xive->source;
> >> XiveENDSource *end_xsrc = &xive->end_source;
> >> Error *local_err = NULL;
> >> - MachineState *machine = MACHINE(qdev_get_machine());
> >>
> >> if (!xive->nr_irqs) {
> >> error_setg(errp, "Number of interrupt needs to be greater 0");
> >> @@ -329,6 +328,15 @@ static void spapr_xive_realize(DeviceState *dev,
> >> Error **errp)
> >> xive->tm_base + XIVE_TM_USER_PAGE * (1 <<
> >> TM_SHIFT));
> >>
> >> qemu_register_reset(spapr_xive_reset, dev);
> >> +}
> >> +
> >> +void spapr_xive_late_realize(sPAPRXive *xive, Error **errp)
> >> +{
> >> + Error *local_err = NULL;
> >> + MachineState *machine = MACHINE(qdev_get_machine());
> >> + XiveSource *xsrc = &xive->source;
> >> + XiveENDSource *end_xsrc = &xive->end_source;
> >> + static bool once;
> >>
> >> if (kvm_enabled() && machine_kernel_irqchip_allowed(machine)) {
> >> kvmppc_xive_connect(xive, &local_err);
> >> @@ -351,6 +359,15 @@ static void spapr_xive_realize(DeviceState *dev,
> >> Error **errp)
> >> error_report_err(local_err);
> >> }
> >>
> >> + /*
> >> + * TODO: Emulated mode can only be initialized once. Should we
> >> + * store the information under the device model for later usage ?
> >> + */
> >> + if (once) {
> >> + return;
> >> + }
> >> + once = true;
> >
> > Urgh. static locals are a bad smell. I think at least this flag
> > should go into the instance structure (if we can't deduce it from
> > something else in there).
>
> Yes. I have the same feeling. Maybe we can deduce it from some
> object state. I will check.
Ok.
> >> +
> >> /* TIMA initialization */
> >> memory_region_init_io(&xive->tm_mmio, OBJECT(xive), &xive_tm_ops,
> >> xive,
> >> "xive.tima", 4ull << TM_SHIFT);
> >> diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
> >> index cd81cdb23a5e..99a829fb3f60 100644
> >> --- a/hw/intc/spapr_xive_kvm.c
> >> +++ b/hw/intc/spapr_xive_kvm.c
> >> @@ -657,6 +657,15 @@ void kvmppc_xive_connect(sPAPRXive *xive, Error
> >> **errp)
> >> Error *local_err = NULL;
> >> size_t esb_len = (1ull << xsrc->esb_shift) * xsrc->nr_irqs;
> >> size_t tima_len = 4ull << TM_SHIFT;
> >> + CPUState *cs;
> >> +
> >> + /*
> >> + * The KVM XIVE device already in use. This is the case when
> >> + * rebooting XIVE -> XIVE
> >
> > I take it from this we're leaving the XIVE KVM device initialized if
> > we don't change to XICS during CAS.
>
> No. This is for a XIVE only machine rebooting, which uses the same code.
Ok.
> > Would it make things simpler if
> > we always removed both the XICS and XIVE KVM devices at reset and
> > recreated the one we need at CAS?
>
> On a 'dual' mode machine, we remove both XICS and XIVE KVM devices at
> reset and (re)create the XICS KVM device by default because the OV5 is
> reseted to its default. From there, CAS negotiates a new interrupt mode,
> or nor.
>
> It would be good to keep the XIVE mode activated once selected by CAS
> because it is unlikely to be changed afterwards. I haven't worked on
> that optimization yet.
Yes, handling this as a later optimization is just fine by me.
I'm even fine destroying and recreating the PIC in non-dual mode if it
makes the overall code simpler.
> >> @@ -462,14 +495,8 @@ static sPAPRIrq *spapr_irq_current(sPAPRMachineState
> >> *spapr)
> >> static void spapr_irq_init_dual(sPAPRMachineState *spapr, int nr_irqs,
> >> Error **errp)
> >> {
> >> - MachineState *machine = MACHINE(spapr);
> >> Error *local_err = NULL;
> >>
> >> - if (kvm_enabled() && machine_kernel_irqchip_allowed(machine)) {
> >> - error_setg(errp, "No KVM support for the 'dual' machine");
> >> - return;
> >> - }
> >> -
> >> spapr_irq_xics.init(spapr, spapr_irq_xics.nr_irqs, &local_err);
> >> if (local_err) {
> >> error_propagate(errp, local_err);
> >> @@ -548,6 +575,9 @@ static int spapr_irq_post_load_dual(sPAPRMachineState
> >> *spapr, int version_id)
> >> * defaults to XICS at startup.
> >> */
> >> if (spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
> >> + if (kvm_irqchip_in_kernel()) {
> >> + xics_kvm_disconnect(spapr, &error_fatal);
> >
> > Yeah, this is kinda nasty. I'm wondering if we could make things
> > simpler by always using emulated irqchip until CAS and only setting up
> > kernel irqchip then.
>
> ah. good idea. We could be running with an emulated device before CAS
> and switch to the KVM device once a mode has been selected. I even think
> that an emulated XICS device and a KVM XIVE device can cohabite.
> I need to look closer at this idea. We might have to add some 'bool' to
> know at which stage the machine is running : before or after CAS.
I thought we already had some places testing for that, but now I can't
find them :/. Possibly we could have ov5_case just be NULL from reset
until CAS.
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
signature.asc
Description: PGP signature