qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 12/13] spapr: add KVM support to the 'dual' m


From: David Gibson
Subject: Re: [Qemu-devel] [PATCH v2 12/13] spapr: add KVM support to the 'dual' machine
Date: Wed, 13 Mar 2019 15:18:19 +1100
User-agent: Mutt/1.11.3 (2019-02-01)

On Tue, Mar 12, 2019 at 10:40:19AM +0100, Cédric Le Goater wrote:
> On 2/28/19 6:15 AM, David Gibson wrote:
> > On Fri, Feb 22, 2019 at 02:13:21PM +0100, Cédric Le Goater wrote:
> >> The interrupt mode is chosen by the CAS negotiation process and
> >> activated after a reset to take into account the required changes in
> >> the machine. This brings new constraints on how the associated KVM IRQ
> >> device is initialized.
> >>
> >> Currently, each model takes care of the initialization of the KVM
> >> device in their realize method but this is not possible anymore as the
> >> initialization needs to be done globaly when the interrupt mode is
> >> known, i.e. when machine is reseted. It also means that we need a way
> >> to delete a KVM device when another mode is chosen.
> >>
> >> Also, to support migration, the QEMU objects holding the state to
> >> transfer should always be available but not necessarily activated.
> >>
> >> The overall approach of this proposal is to initialize both interrupt
> >> mode at the QEMU level and keep the IRQ number space in sync to allow
> >> switching from one mode to another. For the KVM side of things, the
> >> whole initialization of the KVM device, sources and presenters, is
> >> grouped in a single routine. The XICS and XIVE sPAPR IRQ reset
> >> handlers are modified accordingly to handle the init and the delete
> >> sequences of the KVM device.
> >>
> >> Signed-off-by: Cédric Le Goater <address@hidden>
> >> ---
> >>  include/hw/ppc/spapr_xive.h |  1 +
> >>  hw/intc/spapr_xive.c        | 19 +++++++-
> >>  hw/intc/spapr_xive_kvm.c    | 27 +++++++++++
> >>  hw/intc/xics_kvm.c          | 26 ++++++++++
> >>  hw/intc/xive.c              |  4 --
> >>  hw/ppc/spapr_irq.c          | 97 ++++++++++++++++++++++++++++---------
> >>  6 files changed, 145 insertions(+), 29 deletions(-)
> >>
> >> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> >> index a7c4c275a747..a1593ac2fcf0 100644
> >> --- a/include/hw/ppc/spapr_xive.h
> >> +++ b/include/hw/ppc/spapr_xive.h
> >> @@ -66,6 +66,7 @@ void spapr_xive_map_mmio(sPAPRXive *xive);
> >>  
> >>  int spapr_xive_end_to_target(uint8_t end_blk, uint32_t end_idx,
> >>                               uint32_t *out_server, uint8_t *out_prio);
> >> +void spapr_xive_late_realize(sPAPRXive *xive, Error **errp);
> >>  
> >>  /*
> >>   * KVM XIVE device helpers
> >> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> >> index 21fe5e1aa39f..b0cbc2fe21ee 100644
> >> --- a/hw/intc/spapr_xive.c
> >> +++ b/hw/intc/spapr_xive.c
> >> @@ -278,7 +278,6 @@ static void spapr_xive_realize(DeviceState *dev, Error 
> >> **errp)
> >>      XiveSource *xsrc = &xive->source;
> >>      XiveENDSource *end_xsrc = &xive->end_source;
> >>      Error *local_err = NULL;
> >> -    MachineState *machine = MACHINE(qdev_get_machine());
> >>  
> >>      if (!xive->nr_irqs) {
> >>          error_setg(errp, "Number of interrupt needs to be greater 0");
> >> @@ -329,6 +328,15 @@ static void spapr_xive_realize(DeviceState *dev, 
> >> Error **errp)
> >>                             xive->tm_base + XIVE_TM_USER_PAGE * (1 << 
> >> TM_SHIFT));
> >>  
> >>      qemu_register_reset(spapr_xive_reset, dev);
> >> +}
> >> +
> >> +void spapr_xive_late_realize(sPAPRXive *xive, Error **errp)
> >> +{
> >> +    Error *local_err = NULL;
> >> +    MachineState *machine = MACHINE(qdev_get_machine());
> >> +    XiveSource *xsrc = &xive->source;
> >> +    XiveENDSource *end_xsrc = &xive->end_source;
> >> +    static bool once;
> >>  
> >>      if (kvm_enabled() && machine_kernel_irqchip_allowed(machine)) {
> >>          kvmppc_xive_connect(xive, &local_err);
> >> @@ -351,6 +359,15 @@ static void spapr_xive_realize(DeviceState *dev, 
> >> Error **errp)
> >>          error_report_err(local_err);
> >>      }
> >>  
> >> +    /*
> >> +     * TODO: Emulated mode can only be initialized once. Should we
> >> +     * store the information under the device model for later usage ?
> >> +     */
> >> +    if (once) {
> >> +        return;
> >> +    }
> >> +    once = true;
> > 
> > Urgh.  static locals are a bad smell.  I think at least this flag
> > should go into the instance structure (if we can't deduce it from
> > something else in there).
> 
> Yes. I have the same feeling. Maybe we can deduce it from some 
> object state. I will check.

Ok.

> >> +
> >>      /* TIMA initialization */
> >>      memory_region_init_io(&xive->tm_mmio, OBJECT(xive), &xive_tm_ops, 
> >> xive,
> >>                            "xive.tima", 4ull << TM_SHIFT);
> >> diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
> >> index cd81cdb23a5e..99a829fb3f60 100644
> >> --- a/hw/intc/spapr_xive_kvm.c
> >> +++ b/hw/intc/spapr_xive_kvm.c
> >> @@ -657,6 +657,15 @@ void kvmppc_xive_connect(sPAPRXive *xive, Error 
> >> **errp)
> >>      Error *local_err = NULL;
> >>      size_t esb_len = (1ull << xsrc->esb_shift) * xsrc->nr_irqs;
> >>      size_t tima_len = 4ull << TM_SHIFT;
> >> +    CPUState *cs;
> >> +
> >> +    /*
> >> +     * The KVM XIVE device already in use. This is the case when
> >> +     * rebooting XIVE -> XIVE
> > 
> > I take it from this we're leaving the XIVE KVM device initialized if
> > we don't change to XICS during CAS. 
> 
> No. This is for a XIVE only machine rebooting, which uses the same code. 

Ok.

> > Would it make things simpler if
> > we always removed both the XICS and XIVE KVM devices at reset and
> > recreated the one we need at CAS?
> 
> On a 'dual' mode machine, we remove both XICS and XIVE KVM devices at
> reset and (re)create the XICS KVM device by default because the OV5 is 
> reseted to its default. From there, CAS negotiates a new interrupt mode,
> or nor.
> 
> It would be good to keep the XIVE mode activated once selected by CAS
> because it is unlikely to be changed afterwards. I haven't worked on 
> that optimization yet.

Yes, handling this as a later optimization is just fine by me.

I'm even fine destroying and recreating the PIC in non-dual mode if it
makes the overall code simpler.

> >> @@ -462,14 +495,8 @@ static sPAPRIrq *spapr_irq_current(sPAPRMachineState 
> >> *spapr)
> >>  static void spapr_irq_init_dual(sPAPRMachineState *spapr, int nr_irqs,
> >>                                  Error **errp)
> >>  {
> >> -    MachineState *machine = MACHINE(spapr);
> >>      Error *local_err = NULL;
> >>  
> >> -    if (kvm_enabled() && machine_kernel_irqchip_allowed(machine)) {
> >> -        error_setg(errp, "No KVM support for the 'dual' machine");
> >> -        return;
> >> -    }
> >> -
> >>      spapr_irq_xics.init(spapr, spapr_irq_xics.nr_irqs, &local_err);
> >>      if (local_err) {
> >>          error_propagate(errp, local_err);
> >> @@ -548,6 +575,9 @@ static int spapr_irq_post_load_dual(sPAPRMachineState 
> >> *spapr, int version_id)
> >>       * defaults to XICS at startup.
> >>       */
> >>      if (spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
> >> +        if (kvm_irqchip_in_kernel()) {
> >> +            xics_kvm_disconnect(spapr, &error_fatal);
> > 
> > Yeah, this is kinda nasty.  I'm wondering if we could make things
> > simpler by always using emulated irqchip until CAS and only setting up
> > kernel irqchip then.
> 
> ah. good idea. We could be running with an emulated device before CAS
> and switch to the KVM device once a mode has been selected. I even think 
> that an emulated XICS device and a KVM XIVE device can cohabite.

> I need to look closer at this idea. We might have to add some 'bool' to
> know at which stage the machine is running : before or after CAS.

I thought we already had some places testing for that, but now I can't
find them :/.  Possibly we could have ov5_case just be NULL from reset
until CAS.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]