qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 0/7] FWNMI fixes / changes


From: Greg Kurz
Subject: Re: [PATCH 0/7] FWNMI fixes / changes
Date: Thu, 5 Mar 2020 11:19:11 +0100

On Thu,  5 Mar 2020 14:59:31 +1000
Nicholas Piggin <address@hidden> wrote:

> Hi,
> 

Hi Nick,

I'm afraid you haven't Cc'd qemu-devel which is mandatory for patch
submission, while qemu-ppc is barely a tag for mail filters.

Cheers,

--
Greg

> Since RFC, I addressed review comments (thank you), I've also dropped
> the MCE changes for now just to concentrate on the basics.
> 
> The major problem is PAPR requires system reset to also be delivered
> to the address registered by nmi-register when FWNMI is enabled.
> 
> This series first allows FWNMI to work on TCG, because sresets can be
> injected under TCG (unlike MCEs) so it becomes important. Then it adds
> sreset to the migration state. Then it wire up sreset delivery to the
> sreset fwnmi address.
> 
> One remaining issue we have now is that Linux and IBM's proprietary
> hypervisor seem to deviate from PAPR when delivering fwnmi sreset
> interrupts. PAPR says all registers should be set as at interrupt
> time, as opposed to MCE delivery where r3 is specified to contain the
> address of memory which contains the original r3 and an error record.
> 
> PowerVM delivers fwnmi sreset with this r3 scheme, and Linux tries to
> restore r3 from that address. Linux also calls "ibm,nmi-interlock" after
> loading r3 from that address, whereas PAPR does not require it. PowerVM
> seems to ignore the call and doesn't synchronize anything.
> 
> I've asked PowerVM developers about this and waiting to see what comes
> of that before going further here. This is still broken on Linux at
> worst we can corrupt r3 with a random value from the rtas blob, at
> best we get:
> 
> [   43.148121] FWNMI: corrupt r3 0x0000000000000000
> [   43.148231] FWNMI: nmi-interlock failed: -3
> [   43.148286] FWNMI: corrupt r3 0x0000000000000000
> [   43.148296] FWNMI: nmi-interlock failed: -3
> 
> when handling sresets. However that problem is not introduced by this
> patch, because the Linux kernel fwnmi and 0x100 sreset entry points are
> identical, and this bug happens inside a common handler in an
> 'if (fwnmi_active)' branch.
> 
> We can probably provide this compatible behaviour without being totally
> broken with respect to sreset on multiple CPUs concurrently or nesting
> inside/outside of a machine check handler, if we provide a small
> per-cpu area for the r3 save. That would require an RTAS/SLOF change to
> allocate more space, so I haven't implemented that yet.
> 
> Thanks,
> Nick
> 
> 
> Nicholas Piggin (7):
>   ppc/spapr: Fix FWNMI machine check failure handling
>   ppc/spapr: Change FWNMI names
>   ppc/spapr: Add FWNMI System Reset state
>   ppc/spapr: Fix FWNMI machine check interrupt delivery
>   ppc/spapr: Allow FWNMI on TCG
>   target/ppc: allow ppc_cpu_do_system_reset to take an alternate vector
>   ppc/spapr: Implement FWNMI System Reset delivery
> 
>  hw/ppc/spapr.c                    | 35 ++++++++------
>  hw/ppc/spapr_caps.c               | 19 ++++----
>  hw/ppc/spapr_events.c             | 36 ++++----------
>  hw/ppc/spapr_rtas.c               | 29 ++++++++----
>  include/hw/ppc/spapr.h            | 28 +++++++----
>  target/ppc/cpu.h                  |  3 +-
>  target/ppc/excp_helper.c          | 79 ++++++++++++++++++++++---------
>  tests/qtest/libqos/libqos-spapr.h |  2 +-
>  8 files changed, 135 insertions(+), 96 deletions(-)
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]