qemu-devel

Re: [RFC PATCH 10/43] target/loongarch: Implement vaddw/vsubw


From: gaosong
Subject: Re: [RFC PATCH 10/43] target/loongarch: Implement vaddw/vsubw
Date: Mon, 27 Feb 2023 17:14:00 +0800
User-agent: Mozilla/5.0 (X11; Linux loongarch64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0


On 2023/2/25 at 3:24 AM, Richard Henderson wrote:
On 2/23/23 21:24, gaosong wrote:
I was wrong, the instruction is to sign-extend the odd or even elements of the vector before the operation, not to sign-extend the result.
E.g.
vaddwev_h_b  vd, vj, vk
vd->H[i] = SignExtend(vj->B[2i])  + SignExtend(vk->B[2i]);
vaddwev_w_h  vd, vj, vk
vd->W[i] = SignExtend(vj->H[2i])  + SignExtend(vk->H[2i]);
vaddwev_d_w  vd, vj, vk
vd->D[i] = SignExtend(vj->W[2i])  + SignExtend(vk->W[2i]);
vaddwev_q_d  vd, vj, vk
vd->Q[i] = SignExtend(vj->D[2i])  + SignExtend(vk->D[2i]);

Ok, good example.

Sorry, my description was not comprehensive.

vaddwev_w_h  vd, vj, vk

...

for i in range(4):
    vd->W[i] = SignExtend(vj->H[2i], 32)  + SignExtend(vk->H[2i], 32);

...
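
As a cross-check for the pseudocode above, here is a minimal scalar reference model in C. It is only a sketch: the `vec128_t` union and the name `ref_vaddwev_w_h` are made up for illustration and are not QEMU's register type or helper. It assumes a 128-bit vector with element 0 in the lowest bits.

```c
#include <stdint.h>

/* Hypothetical view of a 128-bit vector register (not QEMU's VReg). */
typedef union {
    int16_t H[8];   /* eight 16-bit elements */
    int32_t W[4];   /* four 32-bit elements  */
} vec128_t;

/*
 * Reference model of:
 *   vd->W[i] = SignExtend(vj->H[2i], 32) + SignExtend(vk->H[2i], 32)
 * Each even halfword is widened BEFORE the add, so the sum cannot wrap.
 */
static void ref_vaddwev_w_h(vec128_t *vd, const vec128_t *vj,
                            const vec128_t *vk)
{
    vec128_t r;     /* temporary, so vd may alias vj or vk */

    for (int i = 0; i < 4; i++) {
        /* Converting int16_t to int32_t sign-extends the value. */
        r.W[i] = (int32_t)vj->H[2 * i] + (int32_t)vk->H[2 * i];
    }
    *vd = r;
}
```

With every even halfword of vj set to 0xfffe and vk all zero, each result word in this model is 0xfffffffe.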

static void gen_vaddwev_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
{
     TCGv_vec t1 = tcg_temp_new_vec_matching(a);
     TCGv_vec t2 = tcg_temp_new_vec_matching(b);

     int halfbits  =  4 << vece;

     /* Sign-extend even elements from a */
     tcg_gen_dupi_vec(vece, t1, MAKE_64BIT_MASK(0, halfbits));
     tcg_gen_and_vec(vece, a, a, t1);

No need to mask off these bits...

I am not sure, but the result is not correct. It's weird.


Like this, with vece set to MO_32:
static void gen_vaddwev_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
{
    TCGv_vec t1 = tcg_temp_new_vec_matching(a);
    TCGv_vec t2 = tcg_temp_new_vec_matching(b);
    int halfbits = 4 << vece;
    tcg_gen_shli_vec(vece, t1, a, halfbits);
    tcg_gen_shri_vec(vece, t1, t1, halfbits);

    tcg_gen_shli_vec(vece, t2, b,  halfbits);
    tcg_gen_shri_vec(vece, t2, t2, halfbits);

    tcg_gen_add_vec(vece, t, t1, t2);

    tcg_temp_free_vec(t1);
    tcg_temp_free_vec(t2);
}
...
       op[MO_16];
        {
            .fniv = gen_vaddwev_s,
            .fno = gen_helper_vaddwev_w_h,
            .opt_opc = vecop_list,
            .vece = MO_32
        },
...
TRANS(vaddwev_w_h, gvec_vvv, MO_16, gvec_vaddwev_s)

input:    0x ffff fffe ffff fffe ffff fffe ffff fffe  + 0
output:   0x 0000 fffe 0000 fffe 0000 fffe 0000 fffe
The correct result is 0xfffffffefffffffefffffffefffffffe.
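
One possible reading of these numbers, offered as a sketch rather than a confirmed diagnosis: the test version above shifts right with tcg_gen_shri_vec, which is a logical shift, while the original patch used tcg_gen_sari_vec, which is an arithmetic shift. The scalar C model below (function names are hypothetical) applies both variants to one 32-bit lane. The arithmetic shift is written out by hand because right-shifting a negative signed value is implementation-defined in C.

```c
#include <stdint.h>

/* shl 16 followed by a LOGICAL shr 16: zero-extends the low halfword. */
static uint32_t low_half_shl_shr(uint32_t lane)
{
    return (lane << 16) >> 16;
}

/* shl 16 followed by an ARITHMETIC sar 16: sign-extends the low halfword. */
static uint32_t low_half_shl_sar(uint32_t lane)
{
    uint32_t x = lane << 16;
    uint32_t r = x >> 16;

    /* Replicate the sign bit into the 16 vacated upper bits. */
    if (x & 0x80000000u) {
        r |= 0xffff0000u;
    }
    return r;
}
```

For the lane value 0xfffffffe from the test input, the logical variant gives 0x0000fffe (the observed output) and the arithmetic variant gives 0xfffffffe (the expected result).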

Thanks.
Song Gao
     tcg_gen_shli_vec(vece, a, a, halfbits);

... because they shift out here anyway.

     tcg_gen_sari_vec(vece, a, a, halfbits);

     /* Sign-extend even elements from b */
     tcg_gen_dupi_vec(vece, t2, MAKE_64BIT_MASK(0, halfbits));
     tcg_gen_and_vec(vece, b, b, t2);
     tcg_gen_shli_vec(vece, b, b, halfbits);
     tcg_gen_sari_vec(vece,  b, b, halfbits);

     tcg_gen_add_vec(vece, t, a, b);

     tcg_temp_free_vec(t1);
     tcg_temp_free_vec(t2);
}

Otherwise this looks good.
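
The review comment about the mask can be checked with a small scalar model. This is a sketch under the assumption of a 32-bit lane with halfbits = 16; the function names are hypothetical. Because the left shift already pushes the upper 16 bits out of the lane, masking them first does not change the result.

```c
#include <stdint.h>

/* Arithmetic shift right by 16 on a 32-bit lane, written portably. */
static uint32_t lane_sar16(uint32_t x)
{
    uint32_t r = x >> 16;

    return (x & 0x80000000u) ? (r | 0xffff0000u) : r;
}

/* As in the quoted patch: and with the low-half mask, then shl, then sar. */
static uint32_t sext_low16_masked(uint32_t lane)
{
    lane &= 0x0000ffffu;            /* MAKE_64BIT_MASK(0, 16) */
    return lane_sar16(lane << 16);
}

/* Without the mask: shl alone already discards the upper 16 bits. */
static uint32_t sext_low16_unmasked(uint32_t lane)
{
    return lane_sar16(lane << 16);
}
```

Both functions return the same value for any input lane, which is why the and step can be dropped.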



