Re: [Qemu-devel] migration: broken ram_save_pending


From: Paolo Bonzini
Subject: Re: [Qemu-devel] migration: broken ram_save_pending
Date: Tue, 04 Feb 2014 11:46:59 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0

On 04/02/2014 08:15, Alexey Kardashevskiy wrote:
So, migration_thread() gets the number of dirty pages and tries to send them
in a loop, but every iteration resets the number of pages to 96 and we start
again. After several tries we cross the BUFFER_DELAY timeout and calculate a
new @max_size, and if the host machine is fast enough it is bigger than 393216
and the next loop finally finishes the migration.

This should have happened pretty much immediately, because it's not while (pending()) but rather

            while (pending_size && pending_size >= max_size)

(it's an "if" in the code, but the idea is the same). And max_size is computed as follows:

            max_size = bandwidth * migrate_max_downtime() / 1000000;

With the default throttling of 32 MiB/s, bandwidth must be something like 33000 (expressed in bytes/ms), and then max_size should be 33000*3*10^9 / 10^6 = 6000000. Where is my computation wrong?
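
Just to make the units explicit, here is a tiny standalone program (not QEMU
code, only a replay of the arithmetic above); the downtime limit is a made-up
30 ms and the other values are taken from this discussion, so substitute
whatever you actually observe:

    /* Sketch of the max_size computation, with assumed units:
     * bandwidth in bytes/ms, migrate_max_downtime() in ns, sizes in bytes. */
    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint64_t bandwidth = 33000;           /* bytes per ms, ~32 MiB/s */
        uint64_t downtime_ns = 30000000;      /* made-up 30 ms limit */
        uint64_t pending_size = 96 * 4096;    /* 96 dirty pages = 393216 */

        uint64_t max_size = bandwidth * downtime_ns / 1000000;

        printf("max_size = %llu bytes\n", (unsigned long long)max_size);
        if (pending_size && pending_size >= max_size) {
            printf("keep iterating\n");
        } else {
            printf("move to completion\n");
        }
        return 0;
    }

With these made-up numbers max_size is 990000, well above the 393216 bytes of
pending dirty pages, so the loop would exit, which is why I would expect this
to converge almost immediately.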

Also, did you profile it to find the hotspot? Perhaps the bitmap operations are taking a lot of time. How big is the guest? Juan's patches were optimizing the bitmaps, but not all of them apply to your case because of hpratio.

I can only think of something simple like below, and I am not sure it does not
break other things. I would expect ram_save_pending() to return the correct
number of bytes QEMU is going to send rather than the number of pages
multiplied by 4096, but checking whether all these pages are really empty is
not too cheap.

If you use qemu_update_position you will use very little bandwidth in the case where a lot of pages are zero.
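
To put rough numbers on that, here is a toy model (not QEMU code; the
per-zero-page wire cost of a block header plus one flag byte is only a guess)
of how far apart "bytes actually written" and "guest RAM covered" end up when
most pages are zero. As I understand it, bumping the position with
qemu_update_position() is what keeps the progress accounting on the second
figure rather than the first:

    /* Toy model only: compare wire traffic with accounted progress when
     * 90% of the pages are zero.  The 9-byte zero-page cost is assumed,
     * and headers/XBZRLE for non-zero pages are ignored. */
    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        const uint64_t page_size = 4096;
        const uint64_t zero_page_wire = 9;    /* assumed: header + flag byte */
        uint64_t pages = 100000, zero_pages = 90000;

        /* zero pages: tiny on the wire, a full page in the accounting */
        uint64_t wire = zero_pages * zero_page_wire
                      + (pages - zero_pages) * page_size;
        uint64_t accounted = pages * page_size;

        printf("wire bytes:      %llu\n", (unsigned long long)wire);
        printf("accounted bytes: %llu\n", (unsigned long long)accounted);
        return 0;
    }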

What you mention in ram_save_pending() is problematic not just because of checking whether the pages are empty, but also because you have to find the nonzero spots in the bitmap!
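
Just to illustrate what that scan involves, a minimal sketch (illustrative
only, not the QEMU code) of walking a bitmap one word at a time to locate the
next set bit; on a big guest the bitmap itself is large, so this is real work
on every pass:

    /* Minimal word-at-a-time bitmap scan, in the spirit of find_next_bit(). */
    #include <stdio.h>

    static long next_dirty(const unsigned long *bmap, long nbits, long start)
    {
        const long bpw = 8 * (long)sizeof(unsigned long);  /* bits per word */
        long i = start;

        while (i < nbits) {
            unsigned long word = bmap[i / bpw] >> (i % bpw);
            if (word) {
                return i + __builtin_ctzl(word);  /* first set bit from i */
            }
            i = (i / bpw + 1) * bpw;              /* skip to the next word */
        }
        return nbits;                             /* nothing dirty */
    }

    int main(void)
    {
        unsigned long bmap[4] = { 0, 0, 1UL << 7, 0 };  /* one "dirty" bit */
        long nbits = 4 * 8 * (long)sizeof(unsigned long);

        printf("first dirty bit: %ld\n", next_dirty(bmap, nbits, 0));
        return 0;
    }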

Paolo

Thanks!


diff --git a/arch_init.c b/arch_init.c
index 2ba297e..90949b0 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -537,16 +537,17 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
                         acct_info.dup_pages++;
                     }
                 }
             } else if (is_zero_range(p, TARGET_PAGE_SIZE)) {
                 acct_info.dup_pages++;
                 bytes_sent = save_block_hdr(f, block, offset, cont,
                                             RAM_SAVE_FLAG_COMPRESS);
                 qemu_put_byte(f, 0);
+                qemu_update_position(f, TARGET_PAGE_SIZE);
                 bytes_sent++;
             } else if (!ram_bulk_stage && migrate_use_xbzrle()) {
                 current_addr = block->offset + offset;
                 bytes_sent = save_xbzrle_page(f, p, current_addr, block,
                                               offset, cont, last_stage);
                 if (!last_stage) {
                     p = get_cached_data(XBZRLE.cache, current_addr);
                 }





