Re: [Qemu-devel] [PATCH v4 6/9] target-ppc: add lxvh8x instruction
From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH v4 6/9] target-ppc: add lxvh8x instruction
Date: Wed, 28 Sep 2016 10:22:30 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0

On 09/28/2016 10:11 AM, Nikunj A Dadhania wrote:
> Richard Henderson <address@hidden> writes:
>
>> On 09/27/2016 10:31 PM, Nikunj A Dadhania wrote:
>>> +DEF_HELPER_1(bswap16x4, i64, i64)
>>
>> DEF_HELPER_FLAGS_1(bswap16x4, TCG_CALL_NO_RWG_SE, i64, i64)
>>
>>> + uint64_t m = 0x00ff00ff00ff00ffull;
>>> + return ((x & m) << 8) | ((x >> 8) & m);
>>
>> ... although I suppose this is only 5 instructions, and could reasonably be
>> done inline too. Especially if you shared the one 64-bit constant across the
>> two bswaps.
>
> Something like this:
>
> static void gen_bswap16x4(TCGv_i64 val)
> {
>     TCGv_i64 mask = tcg_const_i64(0x00FF00FF00FF00FF);
>     TCGv_i64 t0 = tcg_temp_new_i64();
>     TCGv_i64 t1 = tcg_temp_new_i64();
>
>     /* val = ((val & mask) << 8) | ((val >> 8) & mask) */
>     tcg_gen_and_i64(t0, val, mask);
>     tcg_gen_shli_i64(t0, t0, 8);
>     tcg_gen_shri_i64(t1, val, 8);
>     tcg_gen_and_i64(t1, t1, mask);
>     tcg_gen_or_i64(val, t0, t1);
>
>     tcg_temp_free_i64(t0);
>     tcg_temp_free_i64(t1);
>     tcg_temp_free_i64(mask);
> }

Like that, except that since you always perform this twice, you should share
the expensive constant load. Recall also that you need temporaries for the
store, so

  static void gen_bswap16x8(TCGv_i64 outh, TCGv_i64 outl,
                            TCGv_i64 inh, TCGv_i64 inl)

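
To make the intended arithmetic concrete, here is a minimal host-side sketch in
plain C, not TCG code and not the final patch. The names bswap16x4_model,
bswap16x8_model and bswap16x8_selfcheck are hypothetical and exist only for
illustration. It shows one mask value being shared by both 64-bit halves and
the results going to separate outputs, so the inputs are left untouched.

```c
#include <assert.h>
#include <stdint.h>

/* Mask selecting the low byte of every 16-bit lane. */
#define BSWAP16_MASK UINT64_C(0x00ff00ff00ff00ff)

/* Swap the two bytes inside each of the four 16-bit lanes of x. */
static inline uint64_t bswap16x4_model(uint64_t x, uint64_t mask)
{
    return ((x & mask) << 8) | ((x >> 8) & mask);
}

/*
 * Hypothetical model of a two-output helper: the mask is set up once
 * and reused for both halves; the inputs are not modified.
 */
void bswap16x8_model(uint64_t *outh, uint64_t *outl,
                     uint64_t inh, uint64_t inl)
{
    const uint64_t mask = BSWAP16_MASK;   /* shared constant */
    uint64_t h = bswap16x4_model(inh, mask);
    uint64_t l = bswap16x4_model(inl, mask);

    *outh = h;
    *outl = l;
}

/* Small self-check of the model on one known input pair. */
void bswap16x8_selfcheck(void)
{
    uint64_t h, l;

    bswap16x8_model(&h, &l, UINT64_C(0x0123456789abcdef),
                    UINT64_C(0x1122334455667788));
    assert(h == UINT64_C(0x23016745ab89efcd));
    assert(l == UINT64_C(0x2211443366558877));
}
```

A TCG version would presumably follow the same shape, with the mask held in a
single temporary and the five and/shift/or ops emitted once per half.
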
r~