From: Richard Henderson
Subject: Re: [RFC v2 29/76] target/riscv: rvv-0.9: take fractional LMUL into vector max elements calculation
Date: Thu, 30 Jul 2020 05:52:58 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0

On 7/22/20 2:15 AM, frank.chang@sifive.com wrote:
> -/*
> - * A simplification for VLMAX
> - * = (1 << LMUL) * VLEN / (8 * (1 << SEW))
> - * = (VLEN << LMUL) / (8 << SEW)
> - * = (VLEN << LMUL) >> (SEW + 3)
> - * = VLEN >> (SEW + 3 - LMUL)
> - */
> static inline uint32_t vext_get_vlmax(RISCVCPU *cpu, target_ulong vtype)
> {
> uint8_t sew, lmul;
> -
> sew = FIELD_EX64(vtype, VTYPE, VSEW);
> - lmul = FIELD_EX64(vtype, VTYPE, VLMUL);
> - return cpu->cfg.vlen >> (sew + 3 - lmul);
> + lmul = (FIELD_EX64(vtype, VTYPE, VFLMUL) << 2)
> + | FIELD_EX64(vtype, VTYPE, VLMUL);
> + float flmul = flmul_table[lmul];
> + return cpu->cfg.vlen * flmul / (1 << (sew + 3));
> }
I think if you encode lmul differently, the original formulation can still work.
E.g. LMUL = 1   -> lmul = 0
     LMUL = 2   -> lmul = 1
     LMUL = 1/2 -> lmul = -1
so that, for SEW=8 (sew = 0) and LMUL=1/2 (lmul = -1), we get
  cfg.vlen >> (0 + 3 - (-1))
  = cfg.vlen >> (0 + 3 + 1)
  = cfg.vlen >> 4
Which neatly avoids the floating-point calculation that I don't like.
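
A minimal sketch of that idea, outside of QEMU. It assumes the fractional bit and the two VLMUL bits are first combined into one 3-bit value, as the patch above does, and that sign-extending that value yields the signed lmul (so 7 -> -1 for LMUL = 1/2). The names lmul_to_signed() and vlmax_sketch() are made up for illustration and are not part of the patch:

```c
#include <stdint.h>

/*
 * Hypothetical helper: sign-extend a 3-bit LMUL encoding.
 * Assumed mapping: 0..3 stay 0..3 (LMUL = 1, 2, 4, 8) and
 * 4..7 become -4..-1 (e.g. 7 -> -1, i.e. LMUL = 1/2).
 */
static inline int8_t lmul_to_signed(uint8_t lmul3)
{
    return (int8_t)((lmul3 & 0x7) ^ 0x4) - 0x4;
}

/*
 * Hypothetical stand-in for vext_get_vlmax(), taking plain integers.
 * VLMAX = VLEN >> (SEW + 3 - LMUL), where sew is the encoded SEW
 * (0 means 8-bit elements) and lmul may be negative.
 */
static inline uint32_t vlmax_sketch(uint32_t vlen, uint8_t sew, int8_t lmul)
{
    return vlen >> (sew + 3 - lmul);
}
```

With vlen = 128, sew = 0 and lmul = -1 this evaluates 128 >> 4, matching the worked example above, and no float is involved. The shift count stays non-negative as long as lmul is at most 3.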
r~