[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations
From: |
Peter Lieven |
Subject: |
Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations |
Date: |
Mon, 25 Mar 2013 14:23:43 +0100 |
Am 25.03.2013 um 14:02 schrieb Paolo Bonzini <address@hidden>:
>> Maybe I should have explained the output more detailed. The percentages
>> are added. 35.8% in the second last column means that
>> 35.8% have a return value that is less than TARGET_PAGE_SIZE.
>> This was meant to illustrate at how many 64-bit chunks you have
>> to look to grab a certain percentage of non-zero pages.
>
> Ok, I wrongly understood that many pages had 4088 zero bytes but
> the last 8 were not zero. Now it's clearer, and more logical too. :)
>
>> Looking e.g. at the third value it means that looking at the first
>> three 64-bit chunks it will catch 34.0% of all pages.
>> It turns out that the non-zeroness of a page can be detected looking
>> at the first 256 or so bits and only a low
>> percentage turns out to be non-zero at a later position. So after
>> having checked the first chunks one by one
>> there is no big penalty looking at the remaining chunks with the
>> vectorized loop.
>
> I think it makes most sense to unroll the first four non-vectorized
> iterations, i.e. not use SSE and use three or four ifs. Either:
>
> if (foo[0]) return 0;
> if (foo[1]) return 8;
> if (foo[2]) return 16;
> if (foo[3]) return 24;
>
> or
>
> if (foo[0]) return 0;
> if (foo[1] | foo[2] | foo[3]) return 8;
>
> and then proceed on the remaining 4096-4*sizeof(long) bytes with
> the vectorized loop. foo+4 is aligned for SIMD operations on both
> 32- and 64-bit machines, which makes this a nice choice.
i can't start at foo+4 since the remaining X-4*sizeof(long) bytes
are not dividable by 8*sizeof(VECTYPE).
I could just do sty like the following:
const unsigned long *tmp = buf;
for (i = 0;
i < sizeof(VECTYPE) * BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
/ sizeof(unsigned long);
i += 4) {
if (tmp[i + 0]) return i * sizeof(unsigned long);
if (tmp[i + 1]) return (i+1) * sizeof(unsigned long);
if (tmp[i + 2]) return (i+2) * sizeof(unsigned long);
if (tmp[i + 3]) return (i+3) * sizeof(unsigned long);
}
for (i = BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR;
i < len / sizeof(VECTYPE);
i += BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR) {
…
}
Peter
>
> Paolo
>
>> Here is the distribution of return values for the Windows XP example:
>>
>> 25.62% 0.49% 7.86% 0.12% 0.15% 0.05% 0.05% 0.04% 0.05% 0.02% 0.03%
>> 0.02% 0.03% 0.02% 0.02% 0.01% 0.03% 0.02% 0.01% 0.02% 0.02% 0.01%
>> 0.02% 0.01% 0.01% 0.01% 0.01% 0.01% 0.02% 0.00% 0.01% 0.02% 0.03%
>> 0.01% 0.01% 0.01% 0.01% 0.01% 0.01% 0.00% 0.00% 0.01% 0.07% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.01% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.01% 0.02% 0.00% 0.00% 0.00%
>> 0.00% 0.01% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.02% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.03% 0.00% 0.00% 0.00%
>> 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.02% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.01%
>> 0.00% 0.00% 0.00% 0.02% 0.00% 0.00% 0.02% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.01% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01%
>> 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
>> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 64.23%
>>
>> The last value is the percentage of return value of TARGET_PAGE_SIZE
>> meaning the page is all zero.
>>
>> Peter
>>
>>
- Re: [Qemu-devel] [PATCHv4 7/9] migration: do not sent zero pages in bulk stage, (continued)
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Paolo Bonzini, 2013/03/22
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Peter Lieven, 2013/03/22
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Paolo Bonzini, 2013/03/22
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Peter Lieven, 2013/03/23
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Peter Lieven, 2013/03/25
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Paolo Bonzini, 2013/03/25
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Peter Lieven, 2013/03/25
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Paolo Bonzini, 2013/03/25
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations,
Peter Lieven <=
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Peter Lieven, 2013/03/25
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Paolo Bonzini, 2013/03/25
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Peter Lieven, 2013/03/25
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Peter Lieven, 2013/03/26
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Paolo Bonzini, 2013/03/26