Re: [Qemu-ppc] [Qemu-devel] [PATCH 00/34] tcg, target/ppc vector improvements


From: Richard Henderson
Subject: Re: [Qemu-ppc] [Qemu-devel] [PATCH 00/34] tcg, target/ppc vector improvements
Date: Tue, 18 Dec 2018 08:16:49 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.3.1

On 12/18/18 7:26 AM, Mark Cave-Ayland wrote:
> That seems wrong to me. Given that the ppc_avr_t is a union then I'd expect
> it to be in host order? Certainly in the VMX helper macros I've looked at,
> the members are set directly with no byte swapping.

"Host order"?  For both words of the vector?

That's certainly going to cause problems wrt VSX and FPU registers.  We're
hard-coding that as fpu == vsx.u64[0] (both before and after your patch set).
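
To make the concern concrete, here is a minimal sketch, not QEMU code.  The
type vec128_t and the helpers store_full_host_order() and read_fpr_alias()
are hypothetical names made up for illustration.  It assumes "host order for
both words" means the whole 128-bit value is laid out as one host integer, so
that on a little-endian host the most-significant doubleword lands in u64[1].

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for ppc_avr_t; not the QEMU definition. */
typedef union {
    uint8_t  u8[16];
    uint32_t u32[4];
    uint64_t u64[2];
} vec128_t;

/* Runtime probe, so the sketch does not need HOST_WORDS_BIGENDIAN. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 0x01;
}

/* Store hi:lo as one 128-bit integer in host order (assumed layout). */
static void store_full_host_order(vec128_t *v, uint64_t hi, uint64_t lo)
{
    assert(v != NULL);
    v->u64[host_is_big_endian() ? 0 : 1] = hi;
    v->u64[host_is_big_endian() ? 1 : 0] = lo;
}

/* The hard-coded rule from this message: fpu == vsx.u64[0]. */
static uint64_t read_fpr_alias(const vec128_t *v)
{
    assert(v != NULL);
    return v->u64[0];
}
```

Under that assumed layout, read_fpr_alias() returns the most-significant
doubleword only on a big-endian host.  On a little-endian host it returns the
least-significant one, which is not the doubleword the FPR is supposed to
alias.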

For vscr, on master we have

void helper_mtvscr(CPUPPCState *env, ppc_avr_t *r)
{
#if defined(HOST_WORDS_BIGENDIAN)
    env->vscr = r->u32[3];
#else
    env->vscr = r->u32[0];
#endif

and

        if (needs_byteswap) {
            vmxregset->avr[i].u64[0] = bswap64(cpu->env.avr[i].u64[1]);
            vmxregset->avr[i].u64[1] = bswap64(cpu->env.avr[i].u64[0]);
        } else {
            vmxregset->avr[i].u64[0] = cpu->env.avr[i].u64[0];
            vmxregset->avr[i].u64[1] = cpu->env.avr[i].u64[1];
        }
    }
    vmxregset->vscr.u32[3] = cpu_to_dump32(s, cpu->env.vscr);
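
As an aside, the needs_byteswap branch above amounts to reversing all 16 bytes
of the register.  A small self-contained sketch of that, where vec128_t,
swap64() and swap_vec128() are hypothetical names and swap64() merely stands
in for bswap64():

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for ppc_avr_t; not the QEMU definition. */
typedef union {
    uint8_t  u8[16];
    uint64_t u64[2];
} vec128_t;

/* Portable 64-bit byte swap; stands in for bswap64(). */
static uint64_t swap64(uint64_t x)
{
    x = ((x & 0x00ff00ff00ff00ffULL) << 8) |
        ((x >> 8) & 0x00ff00ff00ff00ffULL);
    x = ((x & 0x0000ffff0000ffffULL) << 16) |
        ((x >> 16) & 0x0000ffff0000ffffULL);
    return (x << 32) | (x >> 32);
}

/* Same shape as the needs_byteswap branch: swap halves, byte-swap each. */
static void swap_vec128(vec128_t *dst, const vec128_t *src)
{
    assert(dst != src);
    dst->u64[0] = swap64(src->u64[1]);
    dst->u64[1] = swap64(src->u64[0]);
}
```

After swap_vec128(), dst->u8[i] equals src->u8[15 - i] for every i, on either
host endianness.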

For helper macros that apply the same operation to all lanes, it doesn't
matter in which order the lanes are processed, so of course I would expect
them to be processed in host order.
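
A minimal sketch of why the order is irrelevant here, using a word-wise add as
the example.  The names vec128_t, add_words_up() and add_words_down() are
hypothetical and not taken from the helpers:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for ppc_avr_t; not the QEMU definition. */
typedef union {
    uint32_t u32[4];
    uint64_t u64[2];
} vec128_t;

/* Lane-wise add, walking the host array from index 0 upward. */
static void add_words_up(vec128_t *r, const vec128_t *a, const vec128_t *b)
{
    assert(r && a && b);
    for (int i = 0; i < 4; i++) {
        r->u32[i] = a->u32[i] + b->u32[i];
    }
}

/* The same add, walking the host array from index 3 downward. */
static void add_words_down(vec128_t *r, const vec128_t *a, const vec128_t *b)
{
    assert(r && a && b);
    for (int i = 3; i >= 0; i--) {
        r->u32[i] = a->u32[i] + b->u32[i];
    }
}
```

Each output lane depends only on the input lanes at the same index, so both
loops produce the same result regardless of which guest element a given host
index holds.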

It's cases that do not apply the same operation, such as merges, where the
problems would arise.
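
For contrast, a sketch of a merge-high of words, where the result interleaves
guest words 0 and 1 of the two sources.  It assumes the storage convention
implied by the AVRW macro quoted in this message (guest word i lives at
u32[i] on a big-endian host and at u32[3 - i] on a little-endian host).  The
names vec128_t, guest_word() and merge_high_words() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for ppc_avr_t; not the QEMU definition. */
typedef union {
    uint32_t u32[4];
    uint64_t u64[2];
} vec128_t;

/* Runtime probe, so the sketch does not need HOST_WORDS_BIGENDIAN. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 0x01;
}

/* Host array index of guest word i, mirroring the AVRW convention. */
static int guest_word(int i)
{
    assert(i >= 0 && i < 4);
    return host_is_big_endian() ? i : 3 - i;
}

/* Guest result words: a[0], b[0], a[1], b[1]. */
static void merge_high_words(vec128_t *r, const vec128_t *a,
                             const vec128_t *b)
{
    vec128_t t;   /* temporary, so r may alias a or b */

    t.u32[guest_word(0)] = a->u32[guest_word(0)];
    t.u32[guest_word(1)] = b->u32[guest_word(0)];
    t.u32[guest_word(2)] = a->u32[guest_word(1)];
    t.u32[guest_word(3)] = b->u32[guest_word(1)];
    *r = t;
}
```

If the guest_word() mapping were dropped and plain host indices used instead,
a little-endian host under this convention would pick up guest words 2 and 3
of the sources rather than words 0 and 1.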

There are at least 3 schemes being employed to address this:

#if defined(HOST_WORDS_BIGENDIAN)
#define HI_IDX 0
#define LO_IDX 1
#define AVRB(i) u8[i]
#define AVRW(i) u32[i]
#else
#define HI_IDX 1
#define LO_IDX 0
#define AVRB(i) u8[15-(i)]
#define AVRW(i) u32[3-(i)]
#endif

...
#if defined(HOST_WORDS_BIGENDIAN)
#define EL_IDX(i) (i)
#else
#define EL_IDX(i) (3 - (i))
#endif

...
#if defined(HOST_WORDS_BIGENDIAN)
#define EL_IDX(i) (i)
#else
#define EL_IDX(i) (1 - (i))
#endif

...
#if defined(HOST_WORDS_BIGENDIAN)
        result.u8[i] = a->u8[indexA] ^ b->u8[indexB];
#else
        result.u8[i] = a->u8[15-indexA] ^ b->u8[15-indexB];
#endif
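
All of the snippets above compute the same mapping, just with the element
count baked in differently.  As an illustration only, with a hypothetical
helper guest_elem_index() that is not part of the tree and a runtime probe in
place of HOST_WORDS_BIGENDIAN:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Runtime probe, so the sketch does not need HOST_WORDS_BIGENDIAN. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 0x01;
}

/*
 * Hypothetical helper: host array index of guest element i, for a vector
 * viewed as n equally sized elements.
 *   n = 16  corresponds to AVRB(i) and the 15 - index form
 *   n = 4   corresponds to AVRW(i) and EL_IDX(i) with 3 - (i)
 *   n = 2   corresponds to EL_IDX(i) with 1 - (i), and to HI_IDX/LO_IDX
 */
static int guest_elem_index(int n, int i)
{
    assert(n > 0 && i >= 0 && i < n);
    return host_is_big_endian() ? i : n - 1 - i;
}
```

With this, HI_IDX would correspond to guest_elem_index(2, 0) and LO_IDX to
guest_elem_index(2, 1).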


r~


