qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-ppc] [PATCH v10 2/6] ppc: spapr: Introduce FWNMI capability


From: David Gibson
Subject: Re: [Qemu-ppc] [PATCH v10 2/6] ppc: spapr: Introduce FWNMI capability
Date: Thu, 4 Jul 2019 11:07:14 +1000
User-agent: Mutt/1.12.0 (2019-05-25)

On Wed, Jul 03, 2019 at 02:58:24PM +0530, Aravinda Prasad wrote:
> 
> 
> On Wednesday 03 July 2019 08:33 AM, David Gibson wrote:
> > On Tue, Jul 02, 2019 at 11:54:26AM +0530, Aravinda Prasad wrote:
> >>
> >>
> >> On Tuesday 02 July 2019 09:21 AM, David Gibson wrote:
> >>> On Wed, Jun 12, 2019 at 02:51:04PM +0530, Aravinda Prasad wrote:
> >>>> Introduce the KVM capability KVM_CAP_PPC_FWNMI so that
> >>>> the KVM causes guest exit with NMI as exit reason
> >>>> when it encounters a machine check exception on the
> >>>> address belonging to a guest. Without this capability
> >>>> enabled, KVM redirects machine check exceptions to
> >>>> guest's 0x200 vector.
> >>>>
> >>>> This patch also introduces fwnmi-mce capability to
> >>>> deal with the case when a guest with the
> >>>> KVM_CAP_PPC_FWNMI capability enabled is attempted
> >>>> to migrate to a host that does not support this
> >>>> capability.
> >>>>
> >>>> Signed-off-by: Aravinda Prasad <address@hidden>
> >>>> ---
> >>>>  hw/ppc/spapr.c         |    1 +
> >>>>  hw/ppc/spapr_caps.c    |   26 ++++++++++++++++++++++++++
> >>>>  include/hw/ppc/spapr.h |    4 +++-
> >>>>  target/ppc/kvm.c       |   19 +++++++++++++++++++
> >>>>  target/ppc/kvm_ppc.h   |   12 ++++++++++++
> >>>>  5 files changed, 61 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >>>> index 6dd8aaa..2ef86aa 100644
> >>>> --- a/hw/ppc/spapr.c
> >>>> +++ b/hw/ppc/spapr.c
> >>>> @@ -4360,6 +4360,7 @@ static void spapr_machine_class_init(ObjectClass 
> >>>> *oc, void *data)
> >>>>      smc->default_caps.caps[SPAPR_CAP_NESTED_KVM_HV] = SPAPR_CAP_OFF;
> >>>>      smc->default_caps.caps[SPAPR_CAP_LARGE_DECREMENTER] = SPAPR_CAP_ON;
> >>>>      smc->default_caps.caps[SPAPR_CAP_CCF_ASSIST] = SPAPR_CAP_OFF;
> >>>> +    smc->default_caps.caps[SPAPR_CAP_FWNMI_MCE] = SPAPR_CAP_OFF;
> >>>>      spapr_caps_add_properties(smc, &error_abort);
> >>>>      smc->irq = &spapr_irq_dual;
> >>>>      smc->dr_phb_enabled = true;
> >>>> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> >>>> index 31b4661..2e92eb6 100644
> >>>> --- a/hw/ppc/spapr_caps.c
> >>>> +++ b/hw/ppc/spapr_caps.c
> >>>> @@ -479,6 +479,22 @@ static void cap_ccf_assist_apply(SpaprMachineState 
> >>>> *spapr, uint8_t val,
> >>>>      }
> >>>>  }
> >>>>  
> >>>> +static void cap_fwnmi_mce_apply(SpaprMachineState *spapr, uint8_t val,
> >>>> +                                Error **errp)
> >>>> +{
> >>>> +    if (!val) {
> >>>> +        return; /* Disabled by default */
> >>>> +    }
> >>>> +
> >>>> +    if (tcg_enabled()) {
> >>>> +        error_setg(errp,
> >>>> +"No Firmware Assisted Non-Maskable Interrupts support in TCG, try 
> >>>> cap-fwnmi-mce=off");
> >>>
> >>> Not allowing this for TCG creates an awkward incompatibility between
> >>> KVM and TCG guests.  I can't actually see any reason to ban it for TCG
> >>> - with the current code TCG won't ever generate NMIs, but I don't see
> >>> that anything will actually break.
> >>>
> >>> In fact, we do have an nmi monitor command, currently wired to the
> >>> spapr_nmi() function which resets each cpu, but it probably makes
> >>> sense to wire it up to the fwnmi stuff when present.
> >>
> >> Yes, but that nmi support is not enough to inject a synchronous error
> >> into the guest kernel. For example, we should provide the faulty address
> >> along with other information such as the type of error (slb multi-hit,
> >> memory error, TLB multi-hit) and when the error occurred (load/store)
> >> and whether the error was completely recovered or not. Without such
> >> information we cannot build the error log and pass it on to the guest
> >> kernel. Right now nmi monitor command takes cpu number as the only 
> >> argument.
> > 
> > Obviously we can't inject an arbitrary MCE event with that monitor
> > command.  But isn't there some sort of catch-all / unknown type of MCE
> > event which we could inject?
> 
> We have "unknown" type of error, but we should also pass an address in
> the MCE event log. Strictly speaking this address should be a valid
> address in the current CPU context as MCEs are synchronous errors
> triggered when we touch a bad address.

Well, some of them are.  At least historically both synchronous and
asnchronous MCEs were possible.  Are there really no versions where
you can report an MCE with unknown address?

> We can pass a default address with every nmi, but I am not sure whether
> that will be practically helpful.
> 
> > It seems very confusing to me to have 2 totally separate "nmi"
> > mechanisms.
> > 
> >> So I think TCG support should be a separate patch by itself.
> > 
> > Even if we don't wire up the monitor command, I still don't see
> > anything that this patch breaks - we can support the nmi-register and
> > nmi-interlock calls without ever actually creating MCE events.
> 
> If we support nmi-register and nmi-interlock calls without the monitor
> command wire-up then we will be falsely claiming the nmi support to the
> guest while it is not actually supported.

How so?  AFAICT, from the point of view of the guest this is not
observably different from supporting the NMI mechanism but NMIs never
occurring.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]