From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [Qemu-ppc] [PATCH v8 6/6] migration: Block migration while handling machine check
Date: Thu, 16 May 2019 15:17:47 +0100
User-agent: Mutt/1.11.4 (2019-03-13)

* Aravinda Prasad (address@hidden) wrote:
> 
> 
> On Thursday 16 May 2019 04:24 PM, Greg Kurz wrote:
> > On Mon, 22 Apr 2019 12:33:45 +0530
> > Aravinda Prasad <address@hidden> wrote:
> > 
> >> Block VM migration requests until the machine check
> >> error handling is complete, as (i) these errors are
> >> specific to the source hardware and are irrelevant on
> >> the target hardware, and (ii) these errors cause data
> >> corruption and should be handled before migration.
> >>
> >> Signed-off-by: Aravinda Prasad <address@hidden>
> >> ---
> >>  hw/ppc/spapr_events.c  |   17 +++++++++++++++++
> >>  hw/ppc/spapr_rtas.c    |    4 ++++
> >>  include/hw/ppc/spapr.h |    3 +++
> >>  3 files changed, 24 insertions(+)
> >>
> >> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> >> index 4032db0..45b990c 100644
> >> --- a/hw/ppc/spapr_events.c
> >> +++ b/hw/ppc/spapr_events.c
> >> @@ -41,6 +41,7 @@
> >>  #include "qemu/bcd.h"
> >>  #include "hw/ppc/spapr_ovec.h"
> >>  #include <libfdt.h>
> >> +#include "migration/blocker.h"
> >>  
> >>  #define RTAS_LOG_VERSION_MASK                   0xff000000
> >>  #define   RTAS_LOG_VERSION_6                    0x06000000
> >> @@ -864,6 +865,22 @@ static void spapr_mce_dispatch_elog(PowerPCCPU *cpu, bool recovered)
> >>  void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
> >>  {
> >>      SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> >> +    int ret;
> >> +    Error *local_err = NULL;
> >> +
> >> +    error_setg(&spapr->migration_blocker,
> >> +            "Live migration not supported during machine check handling");
> >> +    ret = migrate_add_blocker(spapr->migration_blocker, &local_err);
> > 
> > migrate_add_blocker() propagates the reason for the failure in local_err,
> > i.e. because a migration is already in progress or --only-migratable was
> > passed on the QEMU command line, along with the error message passed in
> > the first argument. This means that...
> > 
> >> +    if (ret < 0) {
> >> +        /*
> >> +         * We don't want to abort; instead we let the migration continue. In
> >> +         * rare cases, the machine check handler will run on the target
> >> +         * hardware. Though this is not preferable, it is better than aborting
> >> +         * the migration or killing the VM.
> >> +         */
> >> +        error_free(spapr->migration_blocker);
> >> +        fprintf(stderr, "Warning: Machine check during VM migration\n");
> > 
> > ... you should just do:
> > 
> >         error_report_err(local_err);
> > 
> > This also takes care of freeing local_err which would be leaked otherwise.
> 
> Sure. I am planning to use warn_report_err() as I don't want to abort.

I worry about what the high-level effect of this blocker will be.
Failing hardware is a common reason for wanting to migrate, so if
the hardware is reporting lots of errors, this blocker might stop you
from migrating the VM to more solid hardware.
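
To make the lifecycle under discussion concrete, here is a minimal
standalone sketch in C. It is not QEMU code: MockError, MockMachine and
every mock_* function are hypothetical stand-ins that only imitate the
shape of error_setg(), migrate_add_blocker(), migrate_del_blocker() and
warn_report_err() as they are used in the quoted patch. The message text
produced on failure and the NULL check in machine_check_end() are
assumptions of the sketch, not something the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object: it only carries a message. */
typedef struct MockError {
    char *msg;
} MockError;

/* Hypothetical machine state; mirrors the patch's spapr->migration_blocker. */
typedef struct MockMachine {
    MockError *migration_blocker;
} MockMachine;

static bool migration_in_progress; /* mock global migration state */
static int blocker_count;          /* number of registered blockers */

MockError *mock_error_new(const char *msg)
{
    size_t len = strlen(msg) + 1;
    MockError *err = malloc(sizeof(*err));

    if (!err || !(err->msg = malloc(len))) {
        abort();
    }
    memcpy(err->msg, msg, len);
    return err;
}

void mock_error_free(MockError *err)
{
    if (err) {
        free(err->msg);
        free(err);
    }
}

/* Print the error as a warning, then free it: the caller gives up ownership. */
void mock_warn_report_err(MockError *err)
{
    assert(err != NULL);
    fprintf(stderr, "warning: %s\n", err->msg);
    mock_error_free(err);
}

/* Register a blocker; on failure store a newly allocated error in *errp. */
int mock_migrate_add_blocker(const MockError *reason, MockError **errp)
{
    char buf[256];

    if (migration_in_progress) {
        snprintf(buf, sizeof(buf), "migration in progress, cannot block: %s",
                 reason->msg);
        *errp = mock_error_new(buf);
        return -1;
    }
    blocker_count++;
    return 0;
}

void mock_migrate_del_blocker(const MockError *reason)
{
    (void)reason; /* the mock only counts blockers */
    if (blocker_count > 0) {
        blocker_count--;
    }
}

/* A migration request is refused for as long as any blocker is registered. */
bool mock_migrate_start(void)
{
    if (blocker_count > 0) {
        return false;
    }
    migration_in_progress = true;
    return true;
}

/* Machine check delivered: try to block migration until it is handled. */
void machine_check_begin(MockMachine *m)
{
    MockError *local_err = NULL;

    m->migration_blocker =
        mock_error_new("Live migration not supported during machine check handling");
    if (mock_migrate_add_blocker(m->migration_blocker, &local_err) < 0) {
        /* Not fatal: warn (which frees local_err) and drop the unused blocker. */
        mock_warn_report_err(local_err);
        mock_error_free(m->migration_blocker);
        m->migration_blocker = NULL;
    }
}

/* Guest finished handling (ibm,nmi-interlock in the patch): unblock. */
void machine_check_end(MockMachine *m)
{
    if (m->migration_blocker) {
        mock_migrate_del_blocker(m->migration_blocker);
        mock_error_free(m->migration_blocker);
        m->migration_blocker = NULL;
    }
}
```

In this mock, calling machine_check_begin() and then mock_migrate_start()
refuses the migration until machine_check_end() runs, so a host that keeps
raising machine checks would keep refusing it, which is the effect I am
worried about. If the machine check instead arrives while a migration is
already running, the add fails, only a warning is printed, and the
migration carries on.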

Dave

> Regards,
> Aravinda
> 
> > 
> >> +    }
> >>  
> >>      while (spapr->mc_status != -1) {
> >>          /*
> >> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> >> index 997cf19..1229a0e 100644
> >> --- a/hw/ppc/spapr_rtas.c
> >> +++ b/hw/ppc/spapr_rtas.c
> >> @@ -50,6 +50,7 @@
> >>  #include "target/ppc/mmu-hash64.h"
> >>  #include "target/ppc/mmu-book3s-v3.h"
> >>  #include "kvm_ppc.h"
> >> +#include "migration/blocker.h"
> >>  
> >>  static void rtas_display_character(PowerPCCPU *cpu, SpaprMachineState *spapr,
> >>                                     uint32_t token, uint32_t nargs,
> >> @@ -396,6 +397,9 @@ static void rtas_ibm_nmi_interlock(PowerPCCPU *cpu,
> >>          spapr->mc_status = -1;
> >>          qemu_cond_signal(&spapr->mc_delivery_cond);
> >>          rtas_st(rets, 0, RTAS_OUT_SUCCESS);
> >> +        migrate_del_blocker(spapr->migration_blocker);
> >> +        error_free(spapr->migration_blocker);
> >> +        spapr->migration_blocker = NULL;
> >>      }
> >>  }
> >>  
> >> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> >> index 9d16ad1..dda5fd2 100644
> >> --- a/include/hw/ppc/spapr.h
> >> +++ b/include/hw/ppc/spapr.h
> >> @@ -10,6 +10,7 @@
> >>  #include "hw/ppc/spapr_irq.h"
> >>  #include "hw/ppc/spapr_xive.h"  /* For SpaprXive */
> >>  #include "hw/ppc/xics.h"        /* For ICSState */
> >> +#include "qapi/error.h"
> >>  
> >>  struct SpaprVioBus;
> >>  struct SpaprPhbState;
> >> @@ -213,6 +214,8 @@ struct SpaprMachineState {
> >>      SpaprCapabilities def, eff, mig;
> >>  
> >>      unsigned gpu_numa_id;
> >> +
> >> +    Error *migration_blocker;
> >>  };
> >>  
> >>  #define H_SUCCESS         0
> >>
> >>
> > 
> 
> -- 
> Regards,
> Aravinda
> 
> 
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK


