From: David Gibson
Subject: Re: [Qemu-ppc] [PATCH v9 6/6] migration: Include migration support for machine check handling
Date: Fri, 7 Jun 2019 10:23:32 +1000
User-agent: Mutt/1.11.4 (2019-03-13)

On Thu, Jun 06, 2019 at 04:55:18PM +0530, Aravinda Prasad wrote:
>
>
> On Thursday 06 June 2019 08:36 AM, David Gibson wrote:
> > On Wed, May 29, 2019 at 11:10:57AM +0530, Aravinda Prasad wrote:
> >> This patch includes migration support for machine check
> >> handling. Especially this patch blocks VM migration
> >> requests until the machine check error handling is
> >> complete as (i) these errors are specific to the source
> >> hardware and are irrelevant on the target hardware,
> >> (ii) these errors cause data corruption and should
> >> be handled before migration.
> >>
> >> Signed-off-by: Aravinda Prasad <address@hidden>
> >> ---
> >> hw/ppc/spapr.c | 20 ++++++++++++++++++++
> >> hw/ppc/spapr_events.c | 17 +++++++++++++++++
> >> hw/ppc/spapr_rtas.c | 4 ++++
> >> include/hw/ppc/spapr.h | 2 ++
> >> 4 files changed, 43 insertions(+)
> >>
> >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >> index e8a77636..31c4850 100644
> >> --- a/hw/ppc/spapr.c
> >> +++ b/hw/ppc/spapr.c
> >> @@ -2104,6 +2104,25 @@ static const VMStateDescription vmstate_spapr_dtb =
> >> {
> >> },
> >> };
> >>
> >> +static bool spapr_fwnmi_needed(void *opaque)
> >> +{
> >> + SpaprMachineState *spapr = (SpaprMachineState *)opaque;
> >> +
> >> + return (spapr->guest_machine_check_addr == -1) ? 0 : 1;
> >
> > Since we're introducing a PAPR capability to enable this, it would
> > actually be better to check that here, rather than the runtime state.
> > That leads to fewer cases and easier-to-understand semantics for the
> > migration stream.
>
> I am fine with this approach as well.
>
> >
> >> +}
> >> +
> >> +static const VMStateDescription vmstate_spapr_machine_check = {
> >> + .name = "spapr_machine_check",
> >> + .version_id = 1,
> >> + .minimum_version_id = 1,
> >> + .needed = spapr_fwnmi_needed,
> >> + .fields = (VMStateField[]) {
> >> + VMSTATE_UINT64(guest_machine_check_addr, SpaprMachineState),
> >> + VMSTATE_INT32(mc_status, SpaprMachineState),
> >> + VMSTATE_END_OF_LIST()
> >> + },
> >> +};
> >> +
> >> static const VMStateDescription vmstate_spapr = {
> >> .name = "spapr",
> >> .version_id = 3,
> >> @@ -2137,6 +2156,7 @@ static const VMStateDescription vmstate_spapr = {
> >> &vmstate_spapr_dtb,
> >> &vmstate_spapr_cap_large_decr,
> >> &vmstate_spapr_cap_ccf_assist,
> >> + &vmstate_spapr_machine_check,
> >> NULL
> >> }
> >> };
> >> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> >> index 573c0b7..35e21e4 100644
> >> --- a/hw/ppc/spapr_events.c
> >> +++ b/hw/ppc/spapr_events.c
> >> @@ -41,6 +41,7 @@
> >> #include "qemu/bcd.h"
> >> #include "hw/ppc/spapr_ovec.h"
> >> #include <libfdt.h>
> >> +#include "migration/blocker.h"
> >>
> >> #define RTAS_LOG_VERSION_MASK 0xff000000
> >> #define RTAS_LOG_VERSION_6 0x06000000
> >> @@ -855,6 +856,22 @@ static void spapr_mce_dispatch_elog(PowerPCCPU *cpu,
> >> bool recovered)
> >> void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
> >> {
> >> SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> >> + int ret;
> >> + Error *local_err = NULL;
> >> +
> >> + error_setg(&spapr->fwnmi_migration_blocker,
> >> + "Live migration not supported during machine check handling");
> >> + ret = migrate_add_blocker(spapr->fwnmi_migration_blocker, &local_err);
> >> + if (ret < 0) {
> >> + /*
> >> + * We don't want to abort, so let the migration continue. In a
> >> + * rare case, the machine check handler will run on the target
> >> + * hardware. Though this is not preferable, it is better than aborting
> >> + * the migration or killing the VM.
> >> + */
> >> + error_free(spapr->fwnmi_migration_blocker);
> >
> > You should set fwnmi_migration_blocker to NULL here as well.
>
> ok.
>
> >
> > As mentioned on an earlier iteration, the migration blocker is the
> > same every time. Couldn't you just create it once and free at final
> > teardown, rather than recreating it for every NMI?
>
> That means, we create the error string at the time when ibm,nmi-register
> is invoked and tear it down during machine reset?
Or you could even just create it at machine_init time, and tear it
down never, just add/remove it from the blocker slot.
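
Roughly what that lifecycle could look like, as an untested sketch. Everything
here is a stand-in so the example compiles on its own: Error, Machine,
add_blocker(), del_blocker(), mce_begin() and mce_end() are hypothetical names,
not the real QEMU types or the migrate_add_blocker() API. The point is only
that the reason object is created once at init and never freed, while the NMI
path just adds it to and removes it from the blocker list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's Error object. */
typedef struct Error {
    const char *msg;
} Error;

/* Hypothetical stand-in for the global list of migration blockers. */
#define MAX_BLOCKERS 4
static Error *blockers[MAX_BLOCKERS];

/* Put a reason into a free slot; returns 0 on success, -1 if the list is full. */
int add_blocker(Error *reason)
{
    assert(reason != NULL);
    for (size_t i = 0; i < MAX_BLOCKERS; i++) {
        if (blockers[i] == NULL) {
            blockers[i] = reason;
            return 0;
        }
    }
    return -1;
}

/* Take a reason out of the list; the reason object itself is not freed. */
void del_blocker(Error *reason)
{
    for (size_t i = 0; i < MAX_BLOCKERS; i++) {
        if (blockers[i] == reason) {
            blockers[i] = NULL;
        }
    }
}

bool migration_blocked(void)
{
    for (size_t i = 0; i < MAX_BLOCKERS; i++) {
        if (blockers[i] != NULL) {
            return true;
        }
    }
    return false;
}

/* Hypothetical stand-in for the machine state. */
typedef struct Machine {
    Error *fwnmi_migration_blocker; /* created once, lives as long as the machine */
    bool blocker_active;            /* true while the reason sits in the list */
} Machine;

static Error fwnmi_reason = {
    "Live migration not supported during machine check handling"
};

/* Create the reason once at init time. */
void machine_init(Machine *m)
{
    m->fwnmi_migration_blocker = &fwnmi_reason;
    m->blocker_active = false;
}

/* Machine check delivered to the guest: block migration if we can. */
void mce_begin(Machine *m)
{
    if (!m->blocker_active && add_blocker(m->fwnmi_migration_blocker) == 0) {
        m->blocker_active = true;
    }
    /* On failure we carry on without the blocker; nothing to free or reset. */
}

/* Guest has finished handling the machine check: unblock migration. */
void mce_end(Machine *m)
{
    if (m->blocker_active) {
        del_blocker(m->fwnmi_migration_blocker);
        m->blocker_active = false;
    }
}
```

With this shape the failure path needs no error_free() and no pointer reset,
because the reason is never destroyed. Note that the sketch tracks a single
flag, so overlapping machine checks would need a counter instead.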
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson