
Re: [PATCH v5 57/60] target/riscv: vector slide instructions


From: LIU Zhiwei
Subject: Re: [PATCH v5 57/60] target/riscv: vector slide instructions
Date: Tue, 24 Mar 2020 18:51:25 +0800
User-agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.6.0



On 2020/3/17 1:42, Richard Henderson wrote:
On 3/16/20 1:04 AM, LIU Zhiwei wrote:
As a preference, I think you can do away with this helper.
Simply use the slideup helper with argument 1, and then
afterwards store the integer register into element 0.  You should be able to
re-use code from vmv.s.x for that.
When I try it, I find it somewhat difficult, because vmv.s.x will clear
the elements (0 < index < VLEN/SEW).
Well, two things about that:

(1) The 0.8 version of vmv.s.x does *not* zero the other elements, so we'll
want to be prepared for that.

(2) We have 8 insns that, in the end, come down to a direct element access,
possibly with some other processing.

So we'll want basic helper functions that can locate an element by immediate
offset and by variable offset:

/* Compute the offset of vreg[idx] relative to cpu_env.
    The index must be in range of VLMAX. */
int vec_element_ofsi(int vreg, int idx, int sew);

/* Compute a pointer to vreg[idx].
    If need_bound is true, mask idx into VLMAX;
    otherwise we know a priori that idx is already in bounds. */
void vec_element_ofsx(DisasContext *s, TCGv_ptr base,
                       TCGv idx, int sew, bool need_bound);

/* Load idx >= VLMAX ? 0 : vreg[idx] */
void vec_element_loadi(DisasContext *s, TCGv_i64 val,
                        int vreg, int idx, int sew);
void vec_element_loadx(DisasContext *s, TCGv_i64 val,
                        int vreg, TCGv idx, int sew);

/* Store vreg[imm] = val.
    The index must be in range of VLMAX.  */
void vec_element_storei(DisasContext *s, int vreg, int imm,
                         TCGv_i64 val);
void vec_element_storex(DisasContext *s, int vreg,
                         TCGv idx, TCGv_i64 val);
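
As an illustration only, here is a minimal sketch of the two immediate-offset
helpers, assuming the vreg_ofs()/s->vlen conventions used elsewhere in this
series (so vec_element_ofsi grows a DisasContext argument for vreg_ofs) and a
little-endian host; a big-endian host would need the usual index adjustment:

static int vec_element_ofsi(DisasContext *s, int vreg, int idx, int sew)
{
    /* Caller guarantees idx is within VLMAX; little-endian host assumed. */
    return vreg_ofs(s, vreg) + (idx << sew);
}

static void vec_element_loadi(DisasContext *s, TCGv_i64 val,
                              int vreg, int idx, int sew)
{
    int ofs = vec_element_ofsi(s, vreg, idx, sew);

    switch (sew) {
    case MO_8:
        tcg_gen_ld8u_i64(val, cpu_env, ofs);
        break;
    case MO_16:
        tcg_gen_ld16u_i64(val, cpu_env, ofs);
        break;
    case MO_32:
        tcg_gen_ld32u_i64(val, cpu_env, ofs);
        break;
    default:
        tcg_gen_ld_i64(val, cpu_env, ofs);
        break;
    }
}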

(3) It would be handy to have TCGv cpu_vl.
Do you mean I should define cpu_vl as a global TCG variable like cpu_pc,
so that I can check vl == 0 at translation time?

Or just a temp variable?
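
For what it's worth, a global (rather than a temp) would mirror cpu_pc in
target/riscv/translate.c. A rough sketch, assuming CPURISCVState gains a
target_ulong vl field in this series:

/* File scope in target/riscv/translate.c, next to cpu_pc. */
static TCGv cpu_vl;

void riscv_translate_init(void)
{
    /* ... existing cpu_gpr[]/cpu_fpr[] registration unchanged ... */
    cpu_pc = tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, pc), "pc");
    cpu_vl = tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, vl), "vl");
}

With the global in place, the vl == 0 check is a single tcg_gen_brcondi_tl on
cpu_vl, with no explicit load from env needed in each trans_* function.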

Then:

vext.x.v:
     If rs1 == 0,
         Use vec_element_loadi(s, x[rd], vs2, 0, s->sew).
     else
         Use vec_element_loadx(s, x[rd], vs2, x[rs1], true).
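
A sketch of how that recipe might look in the trans_rvv.inc.c context, using
the proposed vec_element_* helpers, the usual decodetree arg struct, and the
existing gen_set_gpr() helper (checks and the SEW-dependent sign extension
are glossed over here):

static bool trans_vext_x_v(DisasContext *s, arg_r *a)
{
    TCGv_i64 tmp = tcg_temp_new_i64();
    TCGv dest = tcg_temp_new();

    if (a->rs1 == 0) {
        /* Special case vmv.x.s: element 0 is always in bounds. */
        vec_element_loadi(s, tmp, a->rs2, 0, s->sew);
    } else {
        /* vec_element_loadx returns 0 for idx >= VLMAX. */
        vec_element_loadx(s, tmp, a->rs2, cpu_gpr[a->rs1], s->sew);
    }

    tcg_gen_trunc_i64_tl(dest, tmp);
    gen_set_gpr(a->rd, dest);

    tcg_temp_free(dest);
    tcg_temp_free_i64(tmp);
    return true;
}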

vmv.s.x:
     over = gen_new_label();
     tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
     For 0.7.1:
         Use tcg_gen_dup8i to zero all VLMAX elements of vd.
         If rs1 == 0, goto done.
     Use vec_element_storei(s, vs2, 0, x[rs1]).
  done:
     gen_set_label(over);
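
Roughly, with vreg_ofs()/s->vlen as assumed elsewhere in the series and
tcg_gen_gvec_dup8i doing the zeroing (a sketch of the 0.7.1 path only, not
final code):

static bool trans_vmv_s_x(DisasContext *s, arg_r *a)
{
    TCGLabel *over = gen_new_label();

    /* Nothing to do when vl == 0. */
    tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);

    /* 0.7.1 semantics: zero all VLMAX elements of vd first. */
    tcg_gen_gvec_dup8i(vreg_ofs(s, a->rd), s->vlen / 8, s->vlen / 8, 0);

    if (a->rs1 != 0) {
        TCGv_i64 tmp = tcg_temp_new_i64();

        tcg_gen_extu_tl_i64(tmp, cpu_gpr[a->rs1]);
        vec_element_storei(s, a->rd, 0, tmp);
        tcg_temp_free_i64(tmp);
    }

    gen_set_label(over);
    return true;
}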

vfmv.f.s:
     Use vec_element_loadi(s, f[rd], vs2, 0, s->sew).
     NaN-box f[rd] as necessary for SEW.
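
Something like the following, assuming cpu_fpr[] from translate.c and SEW of
32 or 64 (narrower SEW and the mstatus.FS update are left out of the sketch):

static bool trans_vfmv_f_s(DisasContext *s, arg_r *a)
{
    vec_element_loadi(s, cpu_fpr[a->rd], a->rs2, 0, s->sew);

    if (s->sew == MO_32) {
        /* NaN-box the single-precision result in the 64-bit FPR. */
        tcg_gen_ori_i64(cpu_fpr[a->rd], cpu_fpr[a->rd],
                        MAKE_64BIT_MASK(32, 32));
    }
    return true;
}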

vfmv.s.f:
     tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
     For 0.7.1:
         Use tcg_gen_dup8i to zero all VLMAX elements of vd.
     Let tmp = f[rs1], nan-boxed as necessary for SEW.
     Use vec_element_storei(s, vs2, 0, tmp).
     gen_set_label(over);
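
And the mirror image for the store direction, under the same assumptions as
the sketches above:

static bool trans_vfmv_s_f(DisasContext *s, arg_r *a)
{
    TCGLabel *over = gen_new_label();
    TCGv_i64 tmp = tcg_temp_new_i64();

    tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);

    /* 0.7.1 semantics: zero all VLMAX elements of vd first. */
    tcg_gen_gvec_dup8i(vreg_ofs(s, a->rd), s->vlen / 8, s->vlen / 8, 0);

    tcg_gen_mov_i64(tmp, cpu_fpr[a->rs1]);
    if (s->sew == MO_32) {
        /* NaN-box the scalar as the note above suggests. */
        tcg_gen_ori_i64(tmp, tmp, MAKE_64BIT_MASK(32, 32));
    }
    vec_element_storei(s, a->rd, 0, tmp);

    gen_set_label(over);
    tcg_temp_free_i64(tmp);
    return true;
}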

vslide1up.vx:
     Ho hum, I forgot about masking.  Some options:
     (1) Call a helper just as you did in your original patch.
     (2) Call a helper only for !vm, for vm as below.

Sorry, I don't get why I need a helper for !vm.
I think I can call vslideup with 1 whether !vm or vm, and then store to vd[0].

Zhiwei
     (3) Call vslideup w/1.
         tcg_gen_brcondi(TCG_COND_EQ, cpu_vl, 0, over);
         If !vm,
             // inline test for v0[0]
             vec_element_loadi(s, tmp, 0, 0, MO_8);
             tcg_gen_andi_i64(tmp, tmp, 1);
             tcg_gen_brcondi(TCG_COND_EQ, tmp, 0, over);
         Use vec_element_storei(s, vd, 0, x[rs1]).
         gen_set_label(over);
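
Put together, option 3 could look roughly like this; gen_vslideup_vx() is a
hypothetical stand-in for however the series emits the vslideup helper call
with an offset of 1:

static bool trans_vslide1up_vx(DisasContext *s, arg_rmrr *a)
{
    TCGLabel *over = gen_new_label();
    TCGv_i64 tmp = tcg_temp_new_i64();
    TCGv src = tcg_temp_new();

    /* vd[i] = vs2[i - 1] for i >= 1, via the existing slideup helper. */
    gen_vslideup_vx(s, a, 1);                 /* hypothetical wrapper */

    tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
    if (!a->vm) {
        /* Inline test of mask bit v0[0]; skip the store when clear. */
        vec_element_loadi(s, tmp, 0, 0, MO_8);
        tcg_gen_andi_i64(tmp, tmp, 1);
        tcg_gen_brcondi_i64(TCG_COND_EQ, tmp, 0, over);
    }

    /* vd[0] = x[rs1] */
    gen_get_gpr(src, a->rs1);
    tcg_gen_extu_tl_i64(tmp, src);
    vec_element_storei(s, a->rd, 0, tmp);

    gen_set_label(over);
    tcg_temp_free(src);
    tcg_temp_free_i64(tmp);
    return true;
}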

vslide1down.vx:
     For !vm, this is complicated enough for a helper.
     If using option 3 for vslide1up, then the store becomes:
     tcg_gen_subi_tl(tmp, cpu_vl, 1);
     vec_element_storex(s, vd, tmp, x[rs1]);
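
If that route is taken, the tail store could be factored into a tiny
(hypothetical) helper around vec_element_storex, e.g.:

/* Hypothetical helper: vd[vl - 1] = x[rs1], the slide1down tail store.
 * Callers have already checked vl != 0 (and the mask bit, if any). */
static void store_last_element(DisasContext *s, int vd, int rs1)
{
    TCGv idx = tcg_temp_new();
    TCGv src = tcg_temp_new();
    TCGv_i64 val = tcg_temp_new_i64();

    tcg_gen_subi_tl(idx, cpu_vl, 1);
    gen_get_gpr(src, rs1);
    tcg_gen_extu_tl_i64(val, src);
    vec_element_storex(s, vd, idx, val);

    tcg_temp_free_i64(val);
    tcg_temp_free(src);
    tcg_temp_free(idx);
}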

vrgather.vx:
     If !vm or !vl_eq_vlmax, use helper.
     vec_element_loadx(s, tmp, vs2, x[rs1]);
     Use tcg_gen_gvec_dup_i64 to store tmp to vd.
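
A sketch of the vl_eq_vlmax fast path; do_vrgather_vx_helper() is a
hypothetical name for the out-of-line fallback, and LMUL > 1 would scale the
sizes used here:

static bool trans_vrgather_vx(DisasContext *s, arg_rmrr *a)
{
    if (!a->vm || !s->vl_eq_vlmax) {
        return do_vrgather_vx_helper(s, a);   /* hypothetical fallback */
    } else {
        /* vd[i] = (x[rs1] >= VLMAX ? 0 : vs2[x[rs1]]) for all elements. */
        TCGv_i64 tmp = tcg_temp_new_i64();
        TCGv idx = tcg_temp_new();

        gen_get_gpr(idx, a->rs1);
        vec_element_loadx(s, tmp, a->rs2, idx, s->sew);
        tcg_gen_gvec_dup_i64(s->sew, vreg_ofs(s, a->rd),
                             s->vlen / 8, s->vlen / 8, tmp);

        tcg_temp_free(idx);
        tcg_temp_free_i64(tmp);
        return true;
    }
}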

vrgather.vi:
     If !vm or !vl_eq_vlmax, use helper.
     If imm >= vlmax,
         Use tcg_gen_dup8i to zero vd;
     else,
         ofs = vec_element_ofsi(s, vs2, imm, s->sew);
         tcg_gen_gvec_dup_mem(sew, vreg_ofs(vd),
                              ofs, vlmax, vlmax);
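
And the immediate form, with the same caveats (hypothetical fallback helper,
LMUL > 1 ignored):

static bool trans_vrgather_vi(DisasContext *s, arg_rmrr *a)
{
    if (!a->vm || !s->vl_eq_vlmax) {
        return do_vrgather_vi_helper(s, a);   /* hypothetical fallback */
    } else {
        int vlmax = (s->vlen / 8) >> s->sew;  /* elements per register */

        if (a->rs1 >= vlmax) {
            /* Every element indexes out of range: vd = 0. */
            tcg_gen_gvec_dup8i(vreg_ofs(s, a->rd),
                               s->vlen / 8, s->vlen / 8, 0);
        } else {
            /* Splat the one in-range source element across vd. */
            int ofs = vec_element_ofsi(s, a->rs2, a->rs1, s->sew);
            tcg_gen_gvec_dup_mem(s->sew, vreg_ofs(s, a->rd), ofs,
                                 s->vlen / 8, s->vlen / 8);
        }
        return true;
    }
}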


r~



