qemu-devel
Re: [RFC PATCH v2 38/44] target/loongarch: Implement vbitsel vset


From: Richard Henderson
Subject: Re: [RFC PATCH v2 38/44] target/loongarch: Implement vbitsel vset
Date: Thu, 13 Apr 2023 12:06:57 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.9.0

On 4/13/23 04:53, gaosong wrote:

On 2023/4/12 2:53 PM, Richard Henderson wrote:

+#define SETANYEQZ(NAME, BIT, E)                                     \
+void HELPER(NAME)(CPULoongArchState *env, uint32_t cd, uint32_t vj) \
+{                                                                   \
+    int i;                                                          \
+    bool ret = false;                                               \
+    VReg *Vj = &(env->fpr[vj].vreg);                                \
+                                                                    \
+    for (i = 0; i < LSX_LEN/BIT; i++) {                             \
+        ret |= (Vj->E(i) == 0);                                     \
+    }                                                               \
+    env->cf[cd & 0x7] = ret;                                        \
+}
+SETANYEQZ(vsetanyeqz_b, 8, B)
+SETANYEQZ(vsetanyeqz_h, 16, H)
+SETANYEQZ(vsetanyeqz_w, 32, W)
+SETANYEQZ(vsetanyeqz_d, 64, D)

These could be inlined, though slightly harder.
Cf. target/arm/sve_helper.c, do_match2 (your n == 0).

Do you mean an inline like trans_vseteqz_v or just an inline helper function?

I meant inline tcg code generation, instead of a call to a helper.
But even if we keep this in a helper, see do_match2 for avoiding the loop over bytes.
OK, e.g.:
#define SETANYEQZ(NAME, MO)                                         \
void HELPER(NAME)(CPULoongArchState *env, uint32_t cd, uint32_t vj) \
{                                                                   \
    int i;                                                          \
    bool ret = false;                                               \
    VReg *Vj = &(env->fpr[vj].vreg);                                \
                                                                    \
    ret = do_match2(0, (uint64_t)Vj->D(0), (uint64_t)Vj->D(1), MO); \
                                                                    \
    env->cf[cd & 0x7] = ret;                                        \
}
SETANYEQZ(vsetanyeqz_b, MO_8)
SETANYEQZ(vsetanyeqz_h, MO_16)
SETANYEQZ(vsetanyeqz_w, MO_32)
SETANYEQZ(vsetanyeqz_d, MO_64)

and
vsetanyeqz.b    $fcc5  $vr11
   v11    : {edc0004d576eef5b, ec03ec0fec03ea47}
------------------
do_match2
bits is 8
m1 is ec03ec0fec03ea47
m0 is edc0004d576eef5b
ones is 1010101
sings is 80808080
cmp1 is 0
cmp0 is edc0004d576eef5b
cmp1 is ec03ec0fec03ea47
cmp0 is 10000
cmp1 is 3000100
ret is 0

But the result is not correct for vsetanyeqz.b. :-)

Well, 'ones' as printed above is only 4 bytes instead of 8, similarly 'sings'. That would certainly explain why it did not detect a zero in byte 5 of 'm0'.

Some problem with your conversion of that function?
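
To make the truncation concrete, here is a minimal standalone sketch of the zero-element test, modelled on the do_match2 approach with n == 0. It is an illustration under stated assumptions, not QEMU code: the names dup_const64, any_zero_elem and any_zero_byte_truncated are hypothetical, and the second function deliberately builds its constants in 32 bits to mimic the "ones is 1010101" symptom in the debug output above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a replicate-constant helper: copy the low
 * (8 << esz) bits of c into every element of a 64-bit word.
 */
static uint64_t dup_const64(unsigned esz, uint64_t c)
{
    assert(esz <= 3);
    switch (esz) {
    case 0:  return 0x0101010101010101ull * (uint8_t)c;
    case 1:  return 0x0001000100010001ull * (uint16_t)c;
    case 2:  return 0x0000000100000001ull * (uint32_t)c;
    default: return c;
    }
}

/* True if any (8 << esz)-bit element of m0 or m1 is zero. */
static bool any_zero_elem(uint64_t m0, uint64_t m1, unsigned esz)
{
    int bits = 8 << esz;
    uint64_t ones = dup_const64(esz, 1);      /* 1 in every element */
    uint64_t signs = ones << (bits - 1);      /* top bit of every element */
    /* (x - 1) & ~x sets an element's top bit when that element is zero. */
    uint64_t cmp0 = (m0 - ones) & ~m0;
    uint64_t cmp1 = (m1 - ones) & ~m1;

    return ((cmp0 | cmp1) & signs) != 0;
}

/*
 * Deliberately broken byte-only variant: the constants cover only the
 * low 4 bytes, so zero bytes in the upper half of each word are missed.
 */
static bool any_zero_byte_truncated(uint64_t m0, uint64_t m1)
{
    uint64_t ones = 0x01010101u;
    uint64_t signs = 0x80808080u;
    uint64_t cmp0 = (m0 - ones) & ~m0;
    uint64_t cmp1 = (m1 - ones) & ~m1;

    return ((cmp0 | cmp1) & signs) != 0;
}
```

With the v11 value shown above (m0 = 0xedc0004d576eef5b, m1 = 0xec03ec0fec03ea47), any_zero_elem(m0, m1, 0) reports true because byte 5 of m0 is zero, whereas the truncated variant reports false, matching the "ret is 0" in the trace.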


r~


