
Re: [Qemu-ppc] [Qemu-devel] [PATCH 5/6] target-ppc: add lxvb16x and lxvh


From: Richard Henderson
Subject: Re: [Qemu-ppc] [Qemu-devel] [PATCH 5/6] target-ppc: add lxvb16x and lxvh8x
Date: Mon, 8 Aug 2016 12:05:21 +0530
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0

On 08/08/2016 10:57 AM, Richard Henderson wrote:
On 08/07/2016 11:06 PM, Nikunj A Dadhania wrote:
+#define LXV(name, access, swap, type, elems)                     \
+uint64_t helper_##name(CPUPPCState *env,                         \
+                       target_ulong addr)                        \
+{                                                                \
+    type r[elems] = {0};                                         \
+    int i, index, bound, step;                                   \
+    if (msr_le) {                                                \
+        index = elems - 1;                                       \
+        bound = -1;                                              \
+        step = -1;                                               \
+    } else  {                                                    \
+        index = 0;                                               \
+        bound = elems;                                           \
+        step = 1;                                                \
+    }                                                            \
+                                                                 \
+    for (i = index; i != bound; i += step) {                     \
+        if (needs_byteswap(env)) {                               \
+            r[i] = swap(access(env, addr, GETPC()));             \
+        } else {                                                 \
+            r[i] =  access(env, addr, GETPC());                  \
+        }                                                        \
+        addr = addr_add(env, addr, sizeof(type));                \
+    }                                                            \
+    return  *((uint64_t *)r);                                    \
+}

This looks more complicated than necessary.

(1) In big-endian mode, surely this simplifies to two 64-bit big-endian loads.

(2) In little-endian mode, the overhead of accessing memory surely dominates,
and therefore we should perform two 64-bit loads and manipulate the data after.

AFAICS, this is easiest done by requesting two 64-bit *big-endian* loads, and
then swapping bytes.  E.g.

uint64_t helper_bswap16x4(uint64_t x)
{
    uint64_t m = 0x00ff00ff00ff00ffull;
    return ((x & m) << 8) | ((x >> 8) & m);
}

uint64_t helper_bswap32x2(uint64_t x)
{
    /* Swap bytes within each 32-bit word; each word keeps its position. */
    return deposit64(bswap32(x), 32, 32, bswap32(x >> 32));
}
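
For illustration, here is a standalone sketch of the "big-endian load, then swap within lanes" idea outside of QEMU. It assumes the intent is to swap bytes within each 16-bit or 32-bit lane while leaving the lanes in place. load_be64, bswap16x4 and bswap32x2 are hypothetical local stand-ins written in plain C, not the QEMU helpers or memory accessors.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit big-endian memory load. */
static uint64_t load_be64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Swap the two bytes inside each of the four 16-bit lanes. */
static uint64_t bswap16x4(uint64_t x)
{
    uint64_t m = 0x00ff00ff00ff00ffull;
    return ((x & m) << 8) | ((x >> 8) & m);
}

/* Swap the four bytes inside each of the two 32-bit lanes:
 * first swap bytes within halfwords, then swap the halfwords
 * within each word. */
static uint64_t bswap32x2(uint64_t x)
{
    uint64_t m = 0x0000ffff0000ffffull;
    x = bswap16x4(x);
    return ((x & m) << 16) | ((x >> 16) & m);
}
```

With memory bytes 01 02 03 04 05 06 07 08, the big-endian load gives 0x0102030405060708. Applying bswap16x4 yields the four little-endian halfwords with element 0 in the most significant lane, and bswap32x2 does the same for two little-endian words.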

To correct myself, this big-endian load really only makes sense for lxvh8x.
For lxvw4x, a little-endian load with a word swap is fewer operations.  I.e.

  tcg_gen_qemu_ld_i64(t0, addr, ctx->mem_idx, MO_LEQ);
  tcg_gen_shri_i64(t1, t0, 32);
  tcg_gen_deposit_i64(dest, t1, t0, 32, 32);
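
As a sanity check, the three TCG ops above can be modelled in plain C. This is only a sketch of what the sequence computes for one doubleword: load_le64, deposit64_model and lxvw4x_le_dword are hypothetical names for this example, and the deposit model assumes the usual semantics (replace len bits of the base value at position pos with the low bits of the field).

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit little-endian memory load. */
static uint64_t load_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Model of a deposit: overwrite len bits of base, starting at bit pos,
 * with the low len bits of field. */
static uint64_t deposit64_model(uint64_t base, int pos, int len,
                                uint64_t field)
{
    uint64_t ones = (len == 64) ? ~0ull : ((1ull << len) - 1);
    uint64_t mask = ones << pos;
    return (base & ~mask) | ((field << pos) & mask);
}

/* Model of: LE load, shift right by 32, deposit the low word on top. */
static uint64_t lxvw4x_le_dword(const uint8_t *p)
{
    uint64_t t0 = load_le64(p);          /* word 1 : word 0 */
    uint64_t t1 = t0 >> 32;              /* 0      : word 1 */
    return deposit64_model(t1, 32, 32, t0); /* word 0 : word 1 */
}
```

For memory bytes 01 02 03 04 05 06 07 08 this gives 0x0403020108070605, i.e. the first little-endian word ends up in the high half, which is the same value a big-endian load followed by a per-word byte swap would produce.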


r~


