Re: [Qemu-devel] [RFC][PATCH 4/9] buffer_is_zero: use vector optimizations if possible
From: Peter Lieven
Subject: Re: [Qemu-devel] [RFC][PATCH 4/9] buffer_is_zero: use vector optimizations if possible
Date: Tue, 12 Mar 2013 17:03:37 +0100
On 12.03.2013 at 17:01, Eric Blake <address@hidden> wrote:
> On 03/12/2013 09:50 AM, Peter Lieven wrote:
>> performance gain on SSE2 is approx. 20-25%. altivec
>> is not tested. performance for unsigned long arithmetic
>> is unchanged.
>>
>> Signed-off-by: Peter Lieven <address@hidden>
>> ---
>> util/cutils.c | 5 +++++
>> 1 file changed, 5 insertions(+)
>>
>> diff --git a/util/cutils.c b/util/cutils.c
>> index a09d8e8..23f0cd6 100644
>> --- a/util/cutils.c
>> +++ b/util/cutils.c
>> @@ -186,6 +186,11 @@ bool buffer_is_zero(const void *buf, size_t len)
>> * latency.
>> */
>>
>> + if (((uintptr_t) buf) % sizeof(VECTYPE) == 0
>> + && len % 8*sizeof(VECTYPE) == 0) {
>
> Space around binary operators. Use CHAR_BITS instead of a magic number
> 8. Also, did you mean:
>
> len % (CHAR_BITS * sizeof(VECTYPE))
>
> instead of what you wrote as '(len % 8) * sizeof(VECTYPE)'?
The 8 is not BITS_PER_BYTE or CHAR_BITS; it is the number of
vectors processed per loop iteration in buffer_find_nonzero_offset().
I will add a named constant for this to make it clearer.
Peter
>
>> + return buffer_find_nonzero_offset(buf, len)==len;
>> + }
>> +
>> size_t i;
>> long d0, d1, d2, d3;
>> const long * const data = buf;
>
> --
> Eric Blake eblake redhat com +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>
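
To make the two points from the review concrete, here is a minimal sketch in C.
It is not the QEMU code: vec_t is a hypothetical 16-byte stand-in for VECTYPE,
and VECS_PER_LOOP is a hypothetical name for the constant discussed above. It
assumes the intended condition is "len is a multiple of 8 vectors", which is
how the reply reads. Because % and * have the same precedence and associate
left to right, the condition as written in the patch only checks len % 8.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte stand-in for VECTYPE; the real type lives in QEMU. */
typedef struct { unsigned char bytes[16]; } vec_t;

/* Hypothetical name for the "8": vectors handled per loop iteration. */
#define VECS_PER_LOOP 8

/* Condition as written in the patch. '%' and '*' share one precedence level
 * and group left to right, so this is (len % 8) * sizeof(vec_t) == 0,
 * which is true whenever len is a multiple of 8. */
static bool len_ok_as_written(size_t len)
{
    return len % 8*sizeof(vec_t) == 0;
}

/* Presumably intended condition: len is a whole number of loop iterations,
 * i.e. a multiple of VECS_PER_LOOP vectors. */
static bool len_ok_intended(size_t len)
{
    return len % (VECS_PER_LOOP * sizeof(vec_t)) == 0;
}

/* Combined guard: buffer aligned to the vector size and length a multiple
 * of one full loop iteration. */
static bool can_use_vector_path(const void *buf, size_t len)
{
    return (uintptr_t)buf % sizeof(vec_t) == 0 && len_ok_intended(len);
}
```

With these stand-ins, a length of 8 passes the check as written but not the
parenthesized one, so the unparenthesized form would let buffers through that
are shorter than a single loop iteration.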