Re: [Qemu-devel] [PATCH v4 4/9] target-ppc: improve lxvw4x implementation
From: Nikunj A Dadhania
Subject: Re: [Qemu-devel] [PATCH v4 4/9] target-ppc: improve lxvw4x implementation
Date: Thu, 29 Sep 2016 09:11:10 +0530
User-agent: Notmuch/0.21 (https://notmuchmail.org) Emacs/25.0.94.1 (x86_64-redhat-linux-gnu)
David Gibson <address@hidden> writes:
> On Wed, Sep 28, 2016 at 11:01:22AM +0530, Nikunj A Dadhania wrote:
>> Load 8byte at a time and manipulate.
>>
>> Big-Endian Storage
>> +-------------+-------------+-------------+-------------+
>> | 00 11 22 33 | 44 55 66 77 | 88 99 AA BB | CC DD EE FF |
>> +-------------+-------------+-------------+-------------+
>>
>> Little-Endian Storage
>> +-------------+-------------+-------------+-------------+
>> | 33 22 11 00 | 77 66 55 44 | BB AA 99 88 | FF EE DD CC |
>> +-------------+-------------+-------------+-------------+
>>
>> Vector load results in:
>> +-------------+-------------+-------------+-------------+
>> | 00 11 22 33 | 44 55 66 77 | 88 99 AA BB | CC DD EE FF |
>> +-------------+-------------+-------------+-------------+
>
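To make the diagrams above concrete, here is a minimal Python model of one way to read "load 8 bytes at a time and manipulate". It is only a sketch, not the TCG code from the patch: the names lxvw4x_model and swap_words are hypothetical, and the word swap on the little-endian path is my assumption about what the manipulation amounts to. It checks that both storage layouts produce the same vector contents.

```python
MASK32 = 0xFFFFFFFF


def swap_words(dword):
    """Exchange the upper and lower 32-bit halves of a 64-bit value."""
    return ((dword & MASK32) << 32) | (dword >> 32)


def lxvw4x_model(mem, little_endian):
    """Model a 16-byte vector load done as two 8-byte loads.

    Hypothetical helper for illustration only. Returns the (high, low)
    64-bit halves of the resulting vector register.
    """
    assert len(mem) == 16
    halves = []
    for off in (0, 8):
        chunk = mem[off:off + 8]
        if little_endian:
            # A little-endian 8-byte load puts the two 32-bit words in the
            # wrong order, so swap them back (assumed manipulation step).
            dword = swap_words(int.from_bytes(chunk, "little"))
        else:
            dword = int.from_bytes(chunk, "big")
        halves.append(dword)
    return tuple(halves)


# Byte sequences copied from the storage diagrams in the patch description.
BE_STORAGE = bytes.fromhex("00112233445566778899AABBCCDDEEFF")
LE_STORAGE = bytes.fromhex("3322110077665544BBAA9988FFEEDDCC")

EXPECTED = (0x0011223344556677, 0x8899AABBCCDDEEFF)
assert lxvw4x_model(BE_STORAGE, little_endian=False) == EXPECTED
assert lxvw4x_model(LE_STORAGE, little_endian=True) == EXPECTED
```

In this model each 32-bit word keeps its value regardless of guest endianness, which is what the "Vector load results in" diagram shows.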
> Ok. I'm guessing from this that implementing those GPR<->VSR
> instructions showed that the earlier versions were endian-incorrect as
> I suspected.
>
> Have you verified that this new implementation is actually faster (or
> at least no slower) on LE than the original implementation with
> individual 32-bit stores?
Results for a million iterations of lxvw4x, mfvsrd/mfvsrld and print:
Without patch:
==============
[tcg_test]$ time ../qemu/ppc64le-linux-user/qemu-ppc64le -cpu POWER9 le_lxvw4x >/dev/null
real 0m2.812s
user 0m2.792s
sys 0m0.020s
[tcg_test]$
With patch:
===========
[tcg_test]$ time ../qemu/ppc64le-linux-user/qemu-ppc64le -cpu POWER9 le_lxvw4x >/dev/null
real 0m2.801s
user 0m2.783s
sys 0m0.018s
[tcg_test]$
Not much perceivable difference. Is there a better way to benchmark?
Regards
Nikunj