qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v4 2/8] target/ppc: rework vmrg{l, h}{b, h, w} i


From: Mark Cave-Ayland
Subject: Re: [Qemu-devel] [PATCH v4 2/8] target/ppc: rework vmrg{l, h}{b, h, w} instructions to use Vsr* macros
Date: Wed, 30 Jan 2019 18:26:59 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0

On 30/01/2019 15:31, Richard Henderson wrote:

>> I'm sorry - I think I got confused by your original macro definitions which 
>> used ofs
>> as both the macro parameter and variable name, which is why I thought you 
>> wanted ofs
>> passed in by value. Based upon the above I now have this which appears to 
>> work:
>>
>>
>> #define VMRG_DO(name, element, access, sofs)                                 
>> \
>>     void helper_v##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)            
>> \
>>     {                                                                        
>> \
>>         ppc_avr_t result;                                                    
>> \
>>         int ofs, half = ARRAY_SIZE(r->element) / 2;                          
>> \
>>         int i;                                                               
>> \
>>                                                                              
>> \
>>         ofs = sofs;                                                          
>> \
> 
> What is this variable and assignment for?
> What does it accomplish?

Sigh. So I thought this was required because of the "int ofs" declaration above 
which
I was under the impression came from one of your previous emails. But in fact 
it was
me that introduced this earlier before I realised this morning that in fact the
preprocessor can do all the work.

>>         for (i = 0; i < half; i++) {                                         
>> \
>>             result.access(i * 2 + 0) = a->access(i + ofs);                   
>> \
>>             result.access(i * 2 + 1) = b->access(i + ofs);                   
>> \
>>         }                                                                    
>> \
>>         *r = result;                                                         
>> \
>>     }
>>
>> #define VMRG(suffix, element, access)          \
>>     VMRG_DO(mrgl##suffix, element, access, half)   \
>>     VMRG_DO(mrgh##suffix, element, access, 0)
>> VMRG(b, u8, VsrB)
>> VMRG(h, u16, VsrH)
>> VMRG(w, u32, VsrW)
>> #undef VMRG_DO
>> #undef VMRG
>>
>>
>> Is this what you were intending? If this looks better then I'll resubmit a 
>> v5 later
>> this evening.
>>
>>> Also, I'm pretty sure it's the "low" merge that wants the "ofs" to be 
>>> non-zero,
>>> since all of the Vsr* symbols implement big-endian indexing.
>>
>> I've just tried swapping them around as you suggested, but then all my OS X
>> background fills appear with corrupted colors. So to the best of my 
>> knowledge without
>> dumping the register contents directly, the above version with low = half 
>> and high =
>> 0 still seems correct.
> 
> That is exactly what I said, still quoted above.
> It is not what you had written in the previous version.

Yes indeed, you are right that was a simple cut and paste error from where I was
testing both variants and pasted the wrong one into the email. Below is the 
hopefully
final version which I've just tested here against OS X:


#define VMRG_DO(name, element, access, ofs)                                  \
    void helper_v##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)            \
    {                                                                        \
        ppc_avr_t result;                                                    \
        int i, half = ARRAY_SIZE(r->element) / 2;                            \
                                                                             \
        for (i = 0; i < half; i++) {                                         \
            result.access(i * 2 + 0) = a->access(i + ofs);                   \
            result.access(i * 2 + 1) = b->access(i + ofs);                   \
        }                                                                    \
        *r = result;                                                         \
    }

#define VMRG(suffix, element, access)          \
    VMRG_DO(mrgl##suffix, element, access, half)   \
    VMRG_DO(mrgh##suffix, element, access, 0)
VMRG(b, u8, VsrB)
VMRG(h, u16, VsrH)
VMRG(w, u32, VsrW)
#undef VMRG_DO
#undef VMRG


ATB,

Mark.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]