
Re: [Qemu-devel] [PATCH 10/10] cutils: Rewrite x86 buffer zero checking


From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH 10/10] cutils: Rewrite x86 buffer zero checking
Date: Tue, 13 Sep 2016 18:33:50 +0200


On 13/09/2016 18:27, Richard Henderson wrote:
> On 09/13/2016 09:10 AM, Paolo Bonzini wrote:
>> @@ -177,16 +231,15 @@ bool test_buffer_is_zero_next_accel(void)
>>  
>>  static bool select_accel_fn(const void *buf, size_t len)
>>  {
>> -    uintptr_t ibuf = (uintptr_t)buf;
>>  #ifdef CONFIG_AVX2_OPT
>> -    if (len % 128 == 0 && ibuf % 32 == 0 && (cpuid_cache & CACHE_AVX2)) {
>> +    if (len >= 128 && (cpuid_cache & CACHE_AVX2)) {
>>          return buffer_zero_avx2(buf, len);
>>      }
>> -    if (len % 64 == 0 && ibuf % 16 == 0 && (cpuid_cache & CACHE_SSE4)) {
>> +    if (len >= 64 && (cpuid_cache & CACHE_SSE4)) {
>>          return buffer_zero_sse4(buf, len);
>>      }
>>  #endif
>> -    if (len % 64 == 0 && ibuf % 16 == 0 && (cpuid_cache & CACHE_SSE2)) {
>> +    if (len >= 64 && (cpuid_cache & CACHE_SSE2)) {
>>          return buffer_zero_sse2(buf, len);
>>      }
> 
> You've dropped a major change to select_accel_fn here.
> 
> (1) The avx2 routine, as written, can support len >= 64, therefore a common
> test works for all of the vectorized functions.
> 
> (2) I had saved the pointer to the routine, so that we didn't have to
> repeatedly test multiple cpuid_cache bits.

Can you send a replacement for this patch only?

Thanks,

Paolo


