[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 26/36] target/arm: Convert Neon VQSHL, VRSHL, VQRSHL 3-reg-sa
From: |
Richard Henderson |
Subject: |
Re: [PATCH 26/36] target/arm: Convert Neon VQSHL, VRSHL, VQRSHL 3-reg-same insns to decodetree |
Date: |
Thu, 30 Apr 2020 18:55:33 -0700 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0 |
On 4/30/20 11:09 AM, Peter Maydell wrote:
> +static bool do_3same_qs32(DisasContext *s, arg_3same *a, NeonGenTwoOpEnvFn
> *fn)
> +{
> + /*
> + * Saturating shift operations handled elementwise 32 bits at a
> + * time which need to pass cpu_env to the helper and where the rn
> + * and rm operands are reversed from the usual do_3same() order.
> + */
Perhaps better to handle this as you did in "Convert Neon 64-bit element
3-reg-same insns", by adding a shim expander that adds env?
It would appear we can then merge
> +{
> + VQSHL_S64_3s 1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64
> + VQSHL_S_3s 1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same
> +}
back into a single pattern:
void gen_gvec_srshl(unsigned vece, uint32_t rd_ofs,
uint32_t rn_ofs, uint32_t rm_ofs,
uint32_t oprsz, uint32_t maxsz)
{
static const GVecGen3 ops[4] = {
{ .fni4 = gen_helper_neon_rshl_s8 },
{ .fni4 = gen_helper_neon_rshl_s16 },
{ .fni4 = gen_helper_neon_rshl_s32 },
{ .fni8 = gen_helper_neon_rshl_s64 }
};
tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
oprsz, maxsz, &ops[vece]);
}
I'm not 100% sure how best to handle the swapped operands issue. I don't think
we want to do it here in gen_gvec_srshl, because we don't have the same reverse
operand problem in the aarch64 encoding, and I'm looking forward to re-using
this generator function in aa64 and sve2.
Maybe it would be better to have
@3same .... ... . . . size:2 .... .... .... . q:1 . . .... \
&3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
@3same_rev .... ... . . . size:2 .... .... .... . q:1 . . .... \
&3same vn=%vm_dp vm=%vn_dp vd=%vd_dp
and swap the operands to "normal" during decode.
FWIW, over in sve.decode, I prepared for reversed operands from the start (to
handle things like SUBR), so the formats have the register names in order:
@rd_rn_rm vs @rd_rm_rn.
r~
- [PATCH 22/36] target/arm: Move gen_ function typedefs to translate.h, (continued)
- [PATCH 22/36] target/arm: Move gen_ function typedefs to translate.h, Peter Maydell, 2020/04/30
- [PATCH 23/36] target/arm: Convert Neon 64-bit element 3-reg-same insns, Peter Maydell, 2020/04/30
- [PATCH 25/36] target/arm: Convert Neon VRHADD, VHSUB, VABD 3-reg-same insns to decodetree, Peter Maydell, 2020/04/30
- [PATCH 21/36] target/arm: Convert Neon 3-reg-same SHA to decodetree, Peter Maydell, 2020/04/30
- [PATCH 24/36] target/arm: Convert Neon VHADD 3-reg-same insns, Peter Maydell, 2020/04/30
- [PATCH 26/36] target/arm: Convert Neon VQSHL, VRSHL, VQRSHL 3-reg-same insns to decodetree, Peter Maydell, 2020/04/30
- Re: [PATCH 26/36] target/arm: Convert Neon VQSHL, VRSHL, VQRSHL 3-reg-same insns to decodetree,
Richard Henderson <=
- [PATCH 27/36] target/arm: Convert Neon VABA 3-reg-same to decodetree, Peter Maydell, 2020/04/30
- [PATCH 29/36] target/arm: Convert Neon VPADD 3-reg-same insns to decodetree, Peter Maydell, 2020/04/30
- [PATCH 31/36] target/arm: Convert Neon VADD, VSUB, VABD 3-reg-same insns to decodetree, Peter Maydell, 2020/04/30
- [PATCH 30/36] target/arm: Convert Neon VQDMULH/VQRDMULH 3-reg-same to decodetree, Peter Maydell, 2020/04/30
- [PATCH 32/36] target/arm: Convert Neon VPMIN/VPMAX/VPADD float 3-reg-same insns to decodetree, Peter Maydell, 2020/04/30