Re: [RFC PATCH 10/43] target/loongarch: Implement vaddw/vsubw
From: gaosong
Subject: Re: [RFC PATCH 10/43] target/loongarch: Implement vaddw/vsubw
Date: Mon, 27 Feb 2023 17:14:00 +0800
User-agent: Mozilla/5.0 (X11; Linux loongarch64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0
On 2023/2/25 3:24 AM, Richard Henderson wrote:
On 2/23/23 21:24, gaosong wrote:
I was wrong, the instruction is to sign-extend the odd or even
elements of the vector before the operation, not to sign-extend the
result.
E.g.
vaddwev_h_b vd, vj, vk
vd->H[i] = SignExtend(vj->B[2i]) + SignExtend(vk->B[2i]);
vaddwev_w_h vd, vj, vk
vd->W[i] = SignExtend(vj->H[2i]) + SignExtend(vk->H[2i]);
vaddwev_d_w vd, vj, vk
vd->D[i] = SignExtend(vj->W[2i]) + SignExtend(vk->W[2i]);
vaddwev_q_d vd, vj, vk
vd->Q[i] = SignExtend(vj->D[2i]) + SignExtend(vk->D[2i]);
Ok, good example.
Sorry, my description was not comprehensive.
vaddwev_w_h vd, vj, vk
...
for i in range(4):
vd->W[i] = SignExtend(vj->H[2i], 32) + SignExtend(vk->H[2i], 32);
...
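The loop above can be modelled in scalar C to check expected values. This is only a sketch: `ref_vaddwev_w_h` is a hypothetical name, not a QEMU helper, and it assumes a 128-bit vector viewed as eight 16-bit source elements and four 32-bit result elements.

```c
#include <stdint.h>

/*
 * Hypothetical scalar reference model of vaddwev_w_h (not QEMU code).
 * Each 32-bit result element is the sum of the sign-extended even
 * 16-bit elements of the two sources.
 */
static void ref_vaddwev_w_h(int32_t vd[4], const int16_t vj[8],
                            const int16_t vk[8])
{
    for (int i = 0; i < 4; i++) {
        /* Converting int16_t to int32_t sign-extends the value. */
        vd[i] = (int32_t)vj[2 * i] + (int32_t)vk[2 * i];
    }
}
```

With every even element of vj set to 0xfffe (that is, -2) and vk all zero, each result element is -2, i.e. 0xfffffffe.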
static void gen_vaddwev_s(unsigned vece, TCGv_vec t, TCGv_vec a,
TCGv_vec b)
{
TCGv_vec t1 = tcg_temp_new_vec_matching(a);
TCGv_vec t2 = tcg_temp_new_vec_matching(b);
int halfbits = 4 << vece;
/* Sign-extend even elements from a */
tcg_gen_dupi_vec(vece, t1, MAKE_64BIT_MASK(0, halfbits));
tcg_gen_and_vec(vece, a, a, t1);
No need to mask off these bits...
I am not sure, but the result is not correct. It's weird.
Like this, with vece being MO_32:
static void gen_vaddwev_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
{
    TCGv_vec t1 = tcg_temp_new_vec_matching(a);
    TCGv_vec t2 = tcg_temp_new_vec_matching(b);
    int halfbits = 4 << vece;
    tcg_gen_shli_vec(vece, t1, a, halfbits);
    tcg_gen_shri_vec(vece, t1, t1, halfbits);
    tcg_gen_shli_vec(vece, t2, b, halfbits);
    tcg_gen_shri_vec(vece, t2, t2, halfbits);
    tcg_gen_add_vec(vece, t, t1, t2);
    tcg_temp_free_vec(t1);
    tcg_temp_free_vec(t2);
}
...
op[MO_16];
{
.fniv = gen_vaddwev_s,
.fno = gen_helper_vaddwev_w_h,
.opt_opc = vecop_list,
.vece = MO_32
},
...
TRANS(vaddwev_w_h, gvec_vvv, MO_16, gvec_vaddwev_s)
input : 0x ffff fffe ffff fffe ffff fffe ffff fffe + 0
output : 0x 0000 fffe 0000 fffe 0000 fffe 0000 fffe
The correct result is 0xfffffffefffffffefffffffefffffffe.
Thanks.
Song Gao
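The reported output is consistent with what a left shift followed by a logical right shift produces on each 32-bit lane: the upper half is filled with zeros rather than with copies of the sign bit. A minimal scalar sketch in C of the two behaviours follows; the helper names are hypothetical and are not part of QEMU or TCG.

```c
#include <stdint.h>

/*
 * Hypothetical helper: shift left by 16, then logical shift right by 16.
 * The upper 16 bits of the result are always zero (zero-extension).
 */
static uint32_t low_half_zext(uint32_t x)
{
    return (x << 16) >> 16;
}

/*
 * Hypothetical helper: sign-extend the low 16 bits to 32 bits.
 * Written with unsigned arithmetic only, so it does not depend on how
 * the compiler implements right shifts of negative signed values.
 */
static uint32_t low_half_sext(uint32_t x)
{
    uint32_t v = x & 0xffffu;
    return (v ^ 0x8000u) - 0x8000u;
}
```

For the lane value 0xfffffffe, `low_half_zext` gives 0x0000fffe, matching the output shown above, while `low_half_sext` gives 0xfffffffe, matching the expected result.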
tcg_gen_shli_vec(vece, a, a, halfbits);
... because they shift out here anyway.
tcg_gen_sari_vec(vece, a, a, halfbits);
/* Sign-extend even elements from b */
tcg_gen_dupi_vec(vece, t2, MAKE_64BIT_MASK(0, halfbits));
tcg_gen_and_vec(vece, b, b, t2);
tcg_gen_shli_vec(vece, b, b, halfbits);
tcg_gen_sari_vec(vece, b, b, halfbits);
tcg_gen_add_vec(vece, t, a, b);
tcg_temp_free_vec(t1);
tcg_temp_free_vec(t2);
}
Otherwise this looks good.