qemu-devel

Re: [Qemu-devel] [PATCH] atomic failures on qemu-system-riscv64


From: Alistair Francis
Subject: Re: [Qemu-devel] [PATCH] atomic failures on qemu-system-riscv64
Date: Tue, 24 Sep 2019 18:01:36 -0700

On Tue, Sep 24, 2019 at 4:35 PM Alistair Francis <address@hidden> wrote:
>
> On Tue, Sep 24, 2019 at 1:04 PM Palmer Dabbelt <address@hidden> wrote:
> >
> > On Tue, 24 Sep 2019 11:29:25 PDT (-0700), address@hidden wrote:
> > > On Mon, Jun 24, 2019 at 11:21 AM Joel Sing <address@hidden> wrote:
> > >>
> > >> On 19-06-17 16:52:44, Richard Henderson wrote:
> > >> > On 6/16/19 12:19 PM, Joel Sing wrote:
> > >> > > +    /*
> > >> > > +     * Clear the load reservation, since an SC must fail if there is
> > >> > > +     * an SC to any address, in between an LR and SC pair.
> > >> > > +     */
> > >> > > +    tcg_gen_movi_tl(load_res, 0);
> > >> > > +
> > >> > >      gen_set_label(l2);
> > >> >
> > >> > This clear needs to be moved down below label l2.
> > >> > Otherwise, with lr / sc / sc, the second sc could succeed in error.
> > >>
> > >> Indeed, thanks.
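
For context, a minimal guest-level illustration of the lr / sc / sc case (an
illustrative snippet with a made-up function name, not part of the patch):

/*
 * Illustrative only: after the first sc.w the reservation is gone whether
 * or not it succeeded, so the second sc.w must report failure (a non-zero
 * status) on a conforming implementation.  Clearing load_res only on the
 * address-mismatch path would let the second sc be seen to succeed under
 * emulation.
 */
static int lr_sc_sc(volatile int *p, int v)
{
    int old, r1, r2;
    __asm__ __volatile__(
        "lr.w %0, (%3)\n\t"
        "sc.w %1, %4, (%3)\n\t"   /* consumes the reservation */
        "sc.w %2, %4, (%3)\n\t"   /* must fail */
        : "=&r"(old), "=&r"(r1), "=&r"(r2)
        : "r"(p), "r"(v)
        : "memory");
    return r2;   /* expected to be non-zero */
}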
> > >>
> > >> > FWIW, other targets have used -1 as the "invalid" load reservation,
> > >> > since the architecture does not require address 0 to be unmapped.
> > >> > This should be quite visible in M-mode with paging disabled and ram
> > >> > at offset 0.  Often, other targets require alignment for the lr/sc
> > >> > address, though I don't see that for riscv.
> > >>
> > >> I've switched to -1 as suggested. Regarding the alignment for
> > >> reservations, the specification does require this, although I do not
> > >> recall seeing any enforcement of this by qemu itself.
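
As a sketch of what such enforcement could check (a hypothetical helper, not
existing QEMU code), the requirement is natural alignment to the access size:

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch: the A extension requires the LR/SC address to be
 * naturally aligned to the operand size, and a misaligned address should
 * raise an address-misaligned or access-fault exception rather than
 * establish a reservation. */
static bool lr_sc_addr_aligned(uint64_t addr, unsigned size_bytes)
{
    return (addr & (uint64_t)(size_bytes - 1)) == 0;
}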
> > >>
> > >> New diff follows.
> > >>
> > >> From 8ef31a2ce8ef1cbeee92995a0b2994f480e9bb6d Mon Sep 17 00:00:00 2001
> > >> From: Joel Sing <address@hidden>
> > >> Date: Tue, 25 Jun 2019 02:44:24 +1000
> > >> Subject: [PATCH] Clear load reservations on qemu riscv target
> > >>
> > >> This prevents a load reservation from being placed in one context/process,
> > >> then being used in another, resulting in an SC succeeding incorrectly and
> > >> breaking atomics.
> > >>
> > >> Signed-off-by: Joel Sing <address@hidden>
> > >> ---
> > >>  target/riscv/cpu.c                      | 1 +
> > >>  target/riscv/cpu_helper.c               | 9 +++++++++
> > >>  target/riscv/insn_trans/trans_rva.inc.c | 8 +++++++-
> > >>  3 files changed, 17 insertions(+), 1 deletion(-)
> > >>
> > >> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> > >> index d61bce6d55..e7c8bf48fc 100644
> > >> --- a/target/riscv/cpu.c
> > >> +++ b/target/riscv/cpu.c
> > >> @@ -281,6 +281,7 @@ static void riscv_cpu_reset(CPUState *cs)
> > >>      env->pc = env->resetvec;
> > >>  #endif
> > >>      cs->exception_index = EXCP_NONE;
> > >> +    env->load_res = -1;
> > >>      set_default_nan_mode(1, &env->fp_status);
> > >>  }
> > >>
> > >> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > >> index b17f169681..6a07b12e65 100644
> > >> --- a/target/riscv/cpu_helper.c
> > >> +++ b/target/riscv/cpu_helper.c
> > >> @@ -113,6 +113,15 @@ void riscv_cpu_set_mode(CPURISCVState *env, target_ulong newpriv)
> > >>      }
> > >>      /* tlb_flush is unnecessary as mode is contained in mmu_idx */
> > >>      env->priv = newpriv;
> > >> +
> > >> +    /* Clear the load reservation - otherwise a reservation placed in one
> > >> +     * context/process can be used by another, resulting in an SC succeeding
> > >> +     * incorrectly. Version 2.2 of the ISA specification explicitly requires
> > >> +     * this behaviour, while later revisions say that the kernel "should" use
> > >> +     * an SC instruction to force the yielding of a load reservation on a
> > >> +     * preemptive context switch. As a result, do both.
> > >> +     */
> > >> +    env->load_res = -1;
> > >>  }
> > >>
> > >>  /* get_physical_address - get the physical address for this virtual address
> > >> diff --git a/target/riscv/insn_trans/trans_rva.inc.c b/target/riscv/insn_trans/trans_rva.inc.c
> > >> index f6dbbc065e..fadd88849e 100644
> > >> --- a/target/riscv/insn_trans/trans_rva.inc.c
> > >> +++ b/target/riscv/insn_trans/trans_rva.inc.c
> > >> @@ -61,7 +61,7 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, TCGMemOp mop)
> > >>
> > >>      gen_set_label(l1);
> > >>      /*
> > >> -     * Address comparion failure.  However, we still need to
> > >> +     * Address comparison failure.  However, we still need to
> > >>       * provide the memory barrier implied by AQ/RL.
> > >>       */
> > >>      tcg_gen_mb(TCG_MO_ALL + a->aq * TCG_BAR_LDAQ + a->rl * TCG_BAR_STRL);
> > >> @@ -69,6 +69,12 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, TCGMemOp mop)
> > >>      gen_set_gpr(a->rd, dat);
> > >>
> > >>      gen_set_label(l2);
> > >> +    /*
> > >> +     * Clear the load reservation, since an SC must fail if there is
> > >> +     * an SC to any address, in between an LR and SC pair.
> > >> +     */
> > >> +    tcg_gen_movi_tl(load_res, -1);
> > >> +
> > >>      tcg_temp_free(dat);
> > >>      tcg_temp_free(src1);
> > >>      tcg_temp_free(src2);
> > >> --
> > >
> > > This patch causes boot failures when booting systemd built with musl on RV64.
> > >
> > > It could possibly be a musl bug, but I wanted to throw that out here
> > > first to see what people think.
> >
> > Looking at the musl port, I see at least one bug in their atomics jumping
> > out at me:
> >
> > diff --git a/arch/riscv64/atomic_arch.h b/arch/riscv64/atomic_arch.h
> > index c9765342..41ad4d04 100644
> > --- a/arch/riscv64/atomic_arch.h
> > +++ b/arch/riscv64/atomic_arch.h
> > @@ -14,7 +14,7 @@ static inline int a_cas(volatile int *p, int t, int s)
> >                 "       sc.w.aqrl %1, %4, (%2)\n"
> >                 "       bnez %1, 1b\n"
> >                 "1:"
> > -               : "=&r"(old), "=r"(tmp)
> > +               : "=&r"(old), "=&r"(tmp)
> >                 : "r"(p), "r"(t), "r"(s)
> >                 : "memory");
> >         return old;
> > @@ -31,7 +31,7 @@ static inline void *a_cas_p(volatile void *p, void *t, void *s)
> >                 "       sc.d.aqrl %1, %4, (%2)\n"
> >                 "       bnez %1, 1b\n"
> >                 "1:"
> > -               : "=&r"(old), "=r"(tmp)
> > +               : "=&r"(old), "=&r"(tmp)
> >                 : "r"(p), "r"(t), "r"(s)
> >                 : "memory");
> >         return old;
> >
> > It's a shot in the dark as to whether that'll fix your bug, but I could at
> > least see a mechanism for it: before we yielded load reservations on
> > context switches, that backwards branch would never be taken, so we
> > wouldn't notice if tmp was allocated into one of the same registers as the
> > inputs.  Even if it doesn't fix your issue it's still a bug so I'll send
> > the patch out, just LMK so I can indicate how important the issue is.
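
For reference, a self-contained compare-and-swap in the same LR/SC shape with
the early-clobber applied to both outputs (an illustrative sketch built around
the diff above, not musl's verbatim source; the function name is made up):

/*
 * Sketch: with plain "=r"(tmp) the compiler may allocate tmp to the same
 * register as one of the inputs (e.g. the store value %4), since it assumes
 * inputs are no longer needed once an output is written.  The sc then
 * clobbers that register with its status result, so if the bnez retries
 * (e.g. because the reservation was yielded on a context switch), the
 * retried sc stores the stale status value instead of the intended data.
 * The early-clobber "=&r" forces tmp into a register distinct from all
 * inputs.
 */
static inline int cas_w(volatile int *p, int expected, int desired)
{
    int old, tmp;
    __asm__ __volatile__(
        "1:  lr.w.aqrl %0, (%2)\n\t"
        "    bne %0, %3, 2f\n\t"
        "    sc.w.aqrl %1, %4, (%2)\n\t"
        "    bnez %1, 1b\n\t"
        "2:"
        : "=&r"(old), "=&r"(tmp)
        : "r"(p), "r"(expected), "r"(desired)
        : "memory");
    return old;
}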
>
> I haven't had a chance to test this fix yet. The bug was reported by
> Khem (and other OE people) as it breaks musl for RISC-V.

I did get a chance to test it. This seems to fix the issue :)

Please send the patch to musl and CC me when you do.

Good catch!

Alistair

>
> >
> > This should manifest on hardware, but it looks like we managed to drop
> > that SC patch.  I'll go send the patch out now...
>
> Thanks, do you mind CCing me?
>
> Alistair
>
> >
> > >
> > > Alistair
> > >
> > >> 2.21.0
> > >>
> > >>


