[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v12 10/13] Add xbzrle_encode_buffer and xbzrle_d
From: |
Orit Wasserman |
Subject: |
Re: [Qemu-devel] [PATCH v12 10/13] Add xbzrle_encode_buffer and xbzrle_decode_buffer functions |
Date: |
Tue, 26 Jun 2012 15:23:00 +0300 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:12.0) Gecko/20120430 Thunderbird/12.0.1 |
On 06/19/2012 09:08 PM, Eric Blake wrote:
> On 06/19/2012 09:43 AM, Orit Wasserman wrote:
>> Signed-off-by: Benoit Hudzia <address@hidden>
>> Signed-off-by: Petter Svard <address@hidden>
>> Signed-off-by: Aidan Shribman <address@hidden>
>> Signed-off-by: Orit Wasserman <address@hidden>
>> ---
>> savevm.c | 91
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> 1 files changed, 91 insertions(+), 0 deletions(-)
>>
>
>> +int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
>> + uint8_t *dst, int dlen)
>> +{
>> + uint32_t zrun_len = 0, nzrun_len = 0;
>> + int d = 0 , i = 0;
>> + uint8_t *nzrun_start = NULL;
>> +
>> + while (i < slen) {
>> + /* overflow */
>> + if (d + 2 > dlen) {
>> + return -1;
>> + }
>> +
>> + while (!(old_buf[i] ^ new_buf[i]) && ++i <= slen) {
>> + zrun_len++;
>> + }
>
> Can we vectorize this to compare a word at a time for improved speed?
> It would be really cool if glibc had an interface like memcmp(), but
> which returned the offset of the first difference instead of a
> comparison of the overall memory regions. But even without such an
> interface, pseudocode such as:
>
> while (i < slen) {
> check for overflow
> if (not aligned)
> up to sizeof(long) byte comparisons
> while (long comparisons && not end)
> increment i by sizeof(long)
> if (end of buffer) {
> either buffer unchanged, or we have trailing zrun
> break;
> }
> // found a word that differed
> do byte-wise comparisons on that word
> then back to word-size xor, checking for no zero bytes in the xor (or
> anywhere that a zero byte occurs in isolation, it is more compact to
> pass that lone byte as part of the xor encoding instead of breaking out
> to a zrun)
> }
I will add it.
>
>> +
>> + zrun_len = 0;
>> + nzrun_start = new_buf + i;
>> + while ((old_buf[i] ^ new_buf[i]) != 0 && ++i <= slen) {
>> + nzrun_len++;
>> + }
>
> Note that the encoding for a single unchanged byte with bytes changed on
> both sides is more compact to pass the unchanged byte as part of the
> nzrun, instead of breaking out into a zrun and starting a second nzrun.
> If I'm analyzing things correctly, that's also true for a run of two
> bytes, and it's not until we get to a run of three or more unchanged
> bytes where we start to get compression (except for the corner case of
> ending on a 1- or 2-byte zrun). Should we be altering this algorithm to
> take advantage of this?
I used to have this optimization but didn't see any noticeable performance
improvement.
It made the code much more complicated and less readable so I chose to remove
it.
I think adding SSE support will have better chance in improving the compression
performance.
>
>> +int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen)
>> +{
>> + int i = 0, d = 0;
>> + uint32_t count = 0;
>> +
>> + while (i < slen) {
>> +
>> + /* zrun */
>> + i += uleb128_decode_small(src + i, &count);
>> + d += count;
>> +
>> + /* overflow */
>> + g_assert(d <= dlen);
>
> Again, is g_assert() appropriate for parsing user-supplied input, or are
> there more appropriate ways for gracefully failing an incoming migration
> attempt from corrupt data?
I will change it to return -1.
>
>> +
>> + /* completed decoding */
>> + if (i == slen - 1) {
>> + return d;
>> + }
>> +
>> + /* nzrun */
>> + i += uleb128_decode_small(src + i, &count);
>> +
>> + g_assert(d + count <= dlen);
>> +
>> + memcpy(dst + d , src + i, count);
>
> Am I correct that you are using XOR to compute the locations of nzrun,
> but then transferring the original data (and not the xor data) over the
> wire? This was not made clear in the doc patch of 3/13. I would
> suggest adding an example to the docs of a 4k page that consists only of
> a short string at the beginning of the page (and NUL bytes thereafter),
> then show both the before and after version of that page to demonstrate
> sparse memory changes, as well as the XBZRLE encoding of that page.
>
ok.
Orit
- [Qemu-devel] [PATCH v12 11/13] Add XBZRLE to ram_save_block and ram_save_live, (continued)
- [Qemu-devel] [PATCH v12 11/13] Add XBZRLE to ram_save_block and ram_save_live, Orit Wasserman, 2012/06/19
- [Qemu-devel] [PATCH v12 13/13] Add XBZRLE statistics, Orit Wasserman, 2012/06/19
- [Qemu-devel] [PATCH v12 12/13] Add set_cachesize command, Orit Wasserman, 2012/06/19
- [Qemu-devel] [PATCH v12 09/13] Add migration_end function, Orit Wasserman, 2012/06/19
- [Qemu-devel] [PATCH v12 08/13] Change ram_save_block to return -1 if there are no more changes, Orit Wasserman, 2012/06/19
- [Qemu-devel] [PATCH v12 10/13] Add xbzrle_encode_buffer and xbzrle_decode_buffer functions, Orit Wasserman, 2012/06/19