
Re: [RFC PATCH 10/43] target/loongarch: Implement vaddw/vsubw


From: Richard Henderson
Subject: Re: [RFC PATCH 10/43] target/loongarch: Implement vaddw/vsubw
Date: Fri, 24 Feb 2023 09:24:21 -1000

On 2/23/23 21:24, gaosong wrote:
I was wrong: the instruction sign-extends the odd or even elements of the vector before the operation; it does not sign-extend the result.
E.g.
vaddwev_h_b  vd, vj, vk
vd->H[i] = SignExtend(vj->B[2i]) + SignExtend(vk->B[2i]);
vaddwev_w_h  vd, vj, vk
vd->W[i] = SignExtend(vj->H[2i]) + SignExtend(vk->H[2i]);
vaddwev_d_w  vd, vj, vk
vd->D[i] = SignExtend(vj->W[2i]) + SignExtend(vk->W[2i]);
vaddwev_q_d  vd, vj, vk
vd->Q[i] = SignExtend(vj->D[2i]) + SignExtend(vk->D[2i]);

Ok, good example.
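
For reference, the even-element semantics quoted above can be written as a scalar C sketch for the byte-to-halfword case. The function name ref_vaddwev_h_b and the plain-array view of a 128-bit vector are illustrative assumptions, not QEMU code.

```c
#include <stdint.h>

/*
 * Hypothetical scalar reference for vaddwev.h.b on a 128-bit vector.
 * Each 16-bit result is the sum of the sign-extended even-indexed
 * bytes of the two sources; the odd-indexed bytes are ignored.
 */
static void ref_vaddwev_h_b(int16_t vd[8], const int8_t vj[16],
                            const int8_t vk[16])
{
    for (int i = 0; i < 8; i++) {
        /* int8_t -> int16_t conversion performs the sign extension. */
        vd[i] = (int16_t)((int16_t)vj[2 * i] + (int16_t)vk[2 * i]);
    }
}
```

Because the inputs are widened before the add, the sum of two bytes can never overflow the 16-bit destination.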

static void gen_vaddwev_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
{
     TCGv_vec t1 = tcg_temp_new_vec_matching(a);
     TCGv_vec t2 = tcg_temp_new_vec_matching(b);

     int halfbits = 4 << vece;

     /* Sign-extend even elements from a */
     tcg_gen_dupi_vec(vece, t1, MAKE_64BIT_MASK(0, halfbits));
     tcg_gen_and_vec(vece, a, a, t1);

No need to mask off these bits...

     tcg_gen_shli_vec(vece, a, a, halfbits);

... because they shift out here anyway.

     tcg_gen_sari_vec(vece, a, a, halfbits);

     /* Sign-extend even elements from b */
     tcg_gen_dupi_vec(vece, t2, MAKE_64BIT_MASK(0, halfbits));
     tcg_gen_and_vec(vece, b, b, t2);
     tcg_gen_shli_vec(vece, b, b, halfbits);
     tcg_gen_sari_vec(vece, b, b, halfbits);

     tcg_gen_add_vec(vece, t, a, b);

     tcg_temp_free_vec(t1);
     tcg_temp_free_vec(t2);
}

Otherwise this looks good.
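
The point about the redundant mask can be checked on a single lane in plain C. This is only a sketch for vece = MO_16 (so halfbits = 8); the helper names are made up for illustration, and it assumes the compiler implements right shift of a negative signed value as an arithmetic shift, as GCC and Clang do.

```c
#include <stdint.h>

/* Mirrors shli + sari on one 16-bit lane (vece = MO_16, halfbits = 8). */
static int16_t sext_low_half(uint16_t lane)
{
    /* The high byte is shifted out of the lane here. */
    uint16_t shifted = (uint16_t)(lane << 8);

    /* Assumed arithmetic shift: replicates the sign bit of the low byte. */
    return (int16_t)((int16_t)shifted >> 8);
}

/* Same sequence with the mask applied first, as in the patch. */
static int16_t sext_low_half_masked(uint16_t lane)
{
    return sext_low_half(lane & 0x00ff);
}
```

Both helpers return the same value for every 16-bit input, which is why the dupi/and pair can be dropped.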

         {
             .fniv = gen_vaddwev_s,
             .fno = gen_helper_vaddwev_q_d,
             .opt_opc = vecop_list,
             .vece = MO_128
         },

There are no 128-bit vector operations; you'll need to do this one differently.

Presumably just load the two 64-bit input elements, sign-extend each to 128 bits, add with tcg_gen_add2_i64, and store the two 64-bit halves of the result as output. But that won't fit into the tcg_gen_gvec_3 interface.
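
As a rough illustration of the arithmetic only, here is the same widening add in plain C, with the 128-bit value held as two 64-bit halves. The struct and function names are hypothetical; in TCG the high halves would presumably come from an arithmetic shift right by 63, and the carry handling below is what a double-word add such as tcg_gen_add2_i64 provides.

```c
#include <stdint.h>

/* Hypothetical holder for a 128-bit value as two 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_parts;

/* Sign-extend two 64-bit inputs to 128 bits and add them. */
static u128_parts add_sext_64_to_128(int64_t a, int64_t b)
{
    /* High half of a sign-extended value is all ones if negative, else 0. */
    uint64_t al = (uint64_t)a, ah = a < 0 ? UINT64_MAX : 0;
    uint64_t bl = (uint64_t)b, bh = b < 0 ? UINT64_MAX : 0;
    u128_parts r;

    r.lo = al + bl;
    /* Unsigned wraparound in the low half means a carry into the high half. */
    r.hi = ah + bh + (r.lo < al);
    return r;
}
```

For example, add_sext_64_to_128(INT64_MIN, INT64_MIN) yields lo = 0 and hi = all ones, i.e. -2^64, which does not fit in 64 bits and is why the result needs both halves.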


r~


