From: David Gibson
Subject: [PULL 06/10] ppc/spapr: Don't kill the guest if a recovered FWNMI machine check delivery fails
Date: Tue, 7 Apr 2020 14:36:02 +1000
From: Nicholas Piggin <address@hidden>
Be tolerant of FWNMI delivery errors if the machine check has already been
recovered by the host: report a warning instead of killing the guest.
Signed-off-by: Nicholas Piggin <address@hidden>
Message-Id: <address@hidden>
Reviewed-by: Greg Kurz <address@hidden>
[dwg: Updated comment at Greg's suggestion]
Signed-off-by: David Gibson <address@hidden>
---
hw/ppc/spapr_events.c | 30 +++++++++++++++++++++++++-----
1 file changed, 25 insertions(+), 5 deletions(-)
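The policy this patch applies at both failure sites can be sketched in isolation. This is a minimal standalone illustration, not QEMU code: the enum, the function name mce_delivery_failed() and the use of fprintf() are assumptions made for the example, standing in for error_report(), warn_report() and qemu_system_guest_panicked() in the real code.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical result type for this sketch; QEMU has no such enum. */
enum mce_fail_action {
    MCE_FAIL_WARN,  /* host already recovered: warn and keep the guest running */
    MCE_FAIL_PANIC  /* not recovered: report an error and panic the guest */
};

/*
 * Decide what to do when a machine check cannot be delivered to the guest.
 * 'reason' is the site-specific text, e.g. "rtas_addr not found" or
 * "nested machine check".
 */
static enum mce_fail_action mce_delivery_failed(bool recovered,
                                                const char *reason)
{
    if (!recovered) {
        /* Stand-in for error_report() + qemu_system_guest_panicked(NULL). */
        fprintf(stderr,
                "FWNMI: Unable to deliver machine check to guest: %s.\n",
                reason);
        return MCE_FAIL_PANIC;
    }

    /* Stand-in for warn_report(): the guest survives the failed delivery. */
    fprintf(stderr,
            "FWNMI: Unable to deliver machine check to guest: %s. "
            "Machine check recovered.\n",
            reason);
    return MCE_FAIL_WARN;
}
```

In the patch itself the two branches are written out inline at each site rather than factored into a helper like this one.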
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index c8964eb25d..1069d0197b 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -833,13 +833,28 @@ static void spapr_mce_dispatch_elog(PowerPCCPU *cpu, bool recovered)
/* get rtas addr from fdt */
rtas_addr = spapr_get_rtas_addr();
if (!rtas_addr) {
- error_report(
+ if (!recovered) {
+ error_report(
"FWNMI: Unable to deliver machine check to guest: rtas_addr not found.");
- qemu_system_guest_panicked(NULL);
+ qemu_system_guest_panicked(NULL);
+ } else {
+ warn_report(
+"FWNMI: Unable to deliver machine check to guest: rtas_addr not found. "
+"Machine check recovered.");
+ }
g_free(ext_elog);
return;
}
+ /*
+ * By taking the interlock, we assume that the MCE will be
+ * delivered to the guest. CAUTION: don't add anything that could
+ * prevent the MCE to be delivered after this line, otherwise the
+ * guest won't be able to release the interlock and ultimately
+ * hang/crash?
+ */
+ spapr->fwnmi_machine_check_interlock = cpu->vcpu_id;
+
stq_be_phys(&address_space_memory, rtas_addr + RTAS_ERROR_LOG_OFFSET,
env->gpr[3]);
cpu_physical_memory_write(rtas_addr + RTAS_ERROR_LOG_OFFSET +
@@ -876,9 +891,15 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
* that CPU called "ibm,nmi-interlock")
*/
if (spapr->fwnmi_machine_check_interlock == cpu->vcpu_id) {
- error_report(
+ if (!recovered) {
+ error_report(
"FWNMI: Unable to deliver machine check to guest: nested machine check.");
- qemu_system_guest_panicked(NULL);
+ qemu_system_guest_panicked(NULL);
+ } else {
+ warn_report(
+"FWNMI: Unable to deliver machine check to guest: nested machine check. "
+"Machine check recovered.");
+ }
return;
}
qemu_cond_wait_iothread(&spapr->fwnmi_machine_check_interlock_cond);
@@ -906,7 +927,6 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
warn_report("Received a fwnmi while migration was in progress");
}
- spapr->fwnmi_machine_check_interlock = cpu->vcpu_id;
spapr_mce_dispatch_elog(cpu, recovered);
}
--
2.25.2