qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 4/4] target-ppc: ensure we include the decrement


From: David Gibson
Subject: Re: [Qemu-devel] [PATCH 4/4] target-ppc: ensure we include the decrementer value during migration
Date: Mon, 11 Jan 2016 15:55:19 +1100
User-agent: Mutt/1.5.24 (2015-08-30)

On Mon, Jan 11, 2016 at 12:18:31PM +1100, Alexey Kardashevskiy wrote:
> On 01/09/2016 01:21 AM, Mark Cave-Ayland wrote:
> >On 08/01/16 02:47, Alexey Kardashevskiy wrote:
> >
> >>On 01/07/2016 05:22 AM, Mark Cave-Ayland wrote:
> >>>During local testing with TCG, intermittent errors were found when
> >>>trying to
> >>>migrate Darwin OS images.
> >>>
> >>>The underlying cause was that Darwin resets the decrementer value to
> >>>fairly
> >>>small values on each interrupt. cpu_ppc_set_tb_clk() sets the default
> >>>value
> >>>of the decrementer to 0xffffffff during initialisation which typically
> >>>corresponds to several seconds. Hence when restoring the image, the guest
> >>>would effectively "lose" decrementer interrupts during this time causing
> >>>confusion in the guest.
> >>>
> >>>NOTE: there does seem to be some overlap here with the
> >>>vmstate_ppc_timebase
> >>>code, however it doesn't seem to handle multiple CPUs which is why
> >>>I've gone
> >>>for an independent implementation.
> >>
> >>It does handle multiple CPUs:
> >>
> >>static int timebase_post_load(void *opaque, int version_id)
> >>{
> >>...
> >>      /* Set new offset to all CPUs */
> >>      CPU_FOREACH(cpu) {
> >>          PowerPCCPU *pcpu = POWERPC_CPU(cpu);
> >>          pcpu->env.tb_env->tb_offset = tb_off_adj;
> >>      }
> >>
> >>
> >>It does not transfer DECR though, it transfers the offset instead, one
> >>for all CPUs.
> >>
> >>
> >>The easier solution would be just adding this instead of the whole patch:
> >>
> >>spr_register(env, SPR_DECR, "DECR",
> >>              SPR_NOACCESS, SPR_NOACCESS,
> >>              &spr_read_decr, &spr_write_decr,
> >>              0x00000000);
> >>
> >>somewhere in target-ppc/translate_init.c for CPUs you want to support,
> >>gen_tbl() seems to be the right place for this. As long as it is just
> >>spr_register() and not spr_register_kvm(), I suppose it should not break
> >>KVM and pseries.
> >
> >I've just tried adding that but it then gives the following error on
> >startup:
> >
> >Error: Trying to register SPR 22 (016) twice !
> >
> >Based upon this, the existing registration seems fine. I think the
> >problem is that I can't see anything in __cpu_ppc_store_decr() that
> >updates the spr[SPR_DECR] value when the decrementer register is
> >changed, so it needs to be explicitly added to
> >cpu_pre_save()/cpu_post_load():
> >
> >
> >diff --git a/target-ppc/machine.c b/target-ppc/machine.c
> >index 251a84b..495e58d 100644
> >--- a/target-ppc/machine.c
> >+++ b/target-ppc/machine.c
> >@@ -141,6 +141,7 @@ static void cpu_pre_save(void *opaque)
> >      env->spr[SPR_CFAR] = env->cfar;
> >  #endif
> >      env->spr[SPR_BOOKE_SPEFSCR] = env->spe_fscr;
> >+    env->spr[SPR_DECR] = cpu_ppc_load_decr(env);
> >
> >      for (i = 0; (i < 4) && (i < env->nb_BATs); i++) {
> >          env->spr[SPR_DBAT0U + 2*i] = env->DBAT[0][i];
> >@@ -175,6 +176,7 @@ static int cpu_post_load(void *opaque, int version_id)
> >      env->cfar = env->spr[SPR_CFAR];
> >  #endif
> >      env->spe_fscr = env->spr[SPR_BOOKE_SPEFSCR];
> >+    cpu_ppc_store_decr(env, env->spr[SPR_DECR]);
> >
> >      for (i = 0; (i < 4) && (i < env->nb_BATs); i++) {
> >          env->DBAT[0][i] = env->spr[SPR_DBAT0U + 2*i];
> >
> >
> >I've just tried the diff above instead of my original version and
> >repeated my savevm/loadvm pair test with a Darwin installation and it
> >also fixes the random hang I was seeing on loadvm.
> >
> >Seemingly this should work on KVM in that cpu_ppc_load_decr() and
> >cpu_ppc_store_decr() become no-ops as long as KVM maintains
> >env->spr[SPR_DECR], but a second set of eyeballs would be useful here.
> >
> >If you can let me know if this is suitable then I'll update the patchset
> >based upon your feedback and send out a v2.
> 
> 
> Looks good to me, I'd just wait a day or two in the case if David wants to
> comment.

I was on holiday and missed the start of this thread, I'm afraid, so I
don't fully understand the context here.

Am I right in thinking that this change will essentially freeze the
decrementer across the migration downtime?  That doesn't seem right,
since the decrementer is supposed to be linked to the timebase and
represent real time passing.

In other words, isn't this just skipping the decrementer interrupts at
the qemu level rather than the guest level?

It seems that instead we should be reconstructing the decrementer on
the destination based on an offset from the timebase.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]