Re: [Qemu-devel] broken incoming migration

From: Peter Lieven
Subject: Re: [Qemu-devel] broken incoming migration
Date: Mon, 10 Jun 2013 08:50:15 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130510 Thunderbird/17.0.6

On 10.06.2013 08:39, Alexey Kardashevskiy wrote:
On 06/09/2013 05:27 PM, Peter Lieven wrote:
On 09.06.2013 at 05:09, Alexey Kardashevskiy <address@hidden> wrote:

On 06/09/2013 01:01 PM, Wenchao Xia wrote:
On 2013-6-9 10:34, Alexey Kardashevskiy wrote:
On 06/09/2013 12:16 PM, Wenchao Xia wrote:
On 2013-6-8 16:30, Alexey Kardashevskiy wrote:
On 06/08/2013 06:27 PM, Wenchao Xia wrote:
On 04.06.2013 16:40, Paolo Bonzini wrote:
On 04/06/2013 16:38, Peter Lieven wrote:
On 04.06.2013 16:14, Paolo Bonzini wrote:
On 04/06/2013 15:52, Peter Lieven wrote:
On 30.05.2013 16:41, Paolo Bonzini wrote:
On 30/05/2013 16:38, Peter Lieven wrote:
You could also scan the page for nonzero
values before writing it.
I had this in mind, but then chose the other
approach... which turned out to be a bad idea.

Alexey: I will prepare a patch later today,
could you then please verify it fixes your

Paolo: would we still need the madvise, or is
it enough to not write the zeroes?
It should be enough to not write them.
Problem: checking the pages for zero allocates
them, even at the source.
It doesn't look like it.  I tried this program and top
doesn't show an increasing amount of reserved

#include <stdio.h>
#include <stdlib.h>

int main() {
    char *x = malloc(500 << 20);
    int i, j;
    for (i = 0; i < 500; i += 10) {
        for (j = 0; j < 10 << 20; j += 4096) {
            *(volatile char*) (x + (i << 20) + j);
        }
        getchar();
    }
}
Strange. We are talking about RSS size, right?
None of the three top values change, and only VIRT is
500 MB.
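
A rough way to check the same thing without watching top, offered only as a sketch: the helper names `rss_pages` and `read_touch` are made up for illustration, and reading `/proc/self/statm` is Linux-specific. The function maps anonymous memory directly with mmap, reads one byte per page, and reports how much the resident set grew. On Linux a read fault on untouched anonymous memory is typically served by a shared zero page, so the growth is expected to be small, but that depends on kernel version and configuration.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Resident set size in pages, taken from the second field of
 * /proc/self/statm (Linux only). Returns -1 if it cannot be read. */
long rss_pages(void)
{
    long size, resident;
    FILE *f = fopen("/proc/self/statm", "r");
    if (!f) {
        return -1;
    }
    if (fscanf(f, "%ld %ld", &size, &resident) != 2) {
        resident = -1;
    }
    fclose(f);
    return resident;
}

/* Map `bytes` of fresh anonymous memory, read one byte per page, and
 * return how many of those bytes were nonzero (fresh anonymous memory
 * reads as zero). If rss_delta is not NULL it receives the RSS change
 * in pages observed across the reads; it is only meaningful when
 * rss_pages() works on this system. Returns -1 if mmap fails. */
long read_touch(size_t bytes, long *rss_delta)
{
    long page = sysconf(_SC_PAGESIZE);
    long nonzero = 0;
    unsigned char *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return -1;
    }
    long before = rss_pages();
    for (size_t off = 0; off < bytes; off += (size_t)page) {
        if (*(volatile unsigned char *)(p + off) != 0) {
            nonzero++;
        }
    }
    if (rss_delta) {
        *rss_delta = rss_pages() - before;
    }
    munmap(p, bytes);
    return nonzero;
}
```

Calling `read_touch(16u << 20, &delta)` and then repeating the experiment with a write instead of a read in the loop should make the difference between the two access patterns visible in `delta`.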
is the malloc above using mmapped memory?

which kernel version do you use?

what avoids allocating the memory for me is the
following (with whatever side effects it has ;-))
This would also fail to migrate any page that is swapped
out, breaking overcommit in a more subtle way. :)

the following does also not allocate memory, but qemu
Hi Peter, as the patch says:

"not sending zero pages breaks migration if a page is zero
at the source but not at the destination."

I don't understand why it would be trouble; shouldn't all
pages not received at the destination be treated as zero pages?

How would the destination guest know if some page must be
cleared? The previous patch (which Peter reverted) did not
send anything for the pages which were zero on the source
If a page was not received and the destination knows that the page
should exist according to the total size, fill it with zeros at the
destination; would that solve the problem?
It is _live_ migration: the source sends changes, and the same pages can
change and be sent several times. So we would need to turn
tracking on at the destination to know whether a page was received
from the source or changed by the destination itself (by writing
BIOS/firmware images there, etc.), and then clear the pages which were
touched by the destination and were not sent by the source.
OK, I understand the problem now. For example: the destination boots
up with 0x0000-0xFFFF filled with the BIOS image. The source forgot to send
the zero pages in 0x0000-0xFFFF.

The source did not forget; instead it zeroed these pages during its
life and assumed that they must already be zero at the destination
(as the destination had not started and had not had a chance to write
something there).

After migration, the destination has 0x0000-0xFFFF dirty (different from
Yep. And those pages were empty on the source, which made debugging very
easy :)
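
The failure described above can be reproduced in a toy model. This is only a sketch under stated assumptions, not QEMU code: `SimRam`, `sim_migrate_skip_zero` and `sim_count_stale_pages` are hypothetical names, guest RAM is four 4 KiB pages, and "firmware" is a page filled with 0xAA that the destination wrote before migration started.

```c
#include <stddef.h>
#include <string.h>

enum { SIM_PAGE_SIZE = 4096, SIM_NPAGES = 4 };

typedef struct {
    unsigned char ram[SIM_NPAGES][SIM_PAGE_SIZE];
} SimRam;

/* Return 1 if every byte of the page is zero, else 0. */
static int sim_page_is_zero(const unsigned char *p)
{
    for (size_t i = 0; i < SIM_PAGE_SIZE; i++) {
        if (p[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/* Copy src to dst but send nothing for pages that are zero at the
 * source. Returns the number of pages actually sent. */
int sim_migrate_skip_zero(const SimRam *src, SimRam *dst)
{
    int sent = 0;
    for (int i = 0; i < SIM_NPAGES; i++) {
        if (sim_page_is_zero(src->ram[i])) {
            continue; /* assumed to be zero at the destination too */
        }
        memcpy(dst->ram[i], src->ram[i], SIM_PAGE_SIZE);
        sent++;
    }
    return sent;
}

/* Set up the scenario from the thread and return how many pages
 * differ between source and destination after migration. */
int sim_count_stale_pages(void)
{
    static SimRam src, dst;
    int stale = 0;

    memset(&src, 0, sizeof src);
    memset(&dst, 0, sizeof dst);
    memset(src.ram[1], 0x55, SIM_PAGE_SIZE); /* real guest data */
    memset(dst.ram[0], 0xAA, SIM_PAGE_SIZE); /* firmware loaded at dest */

    sim_migrate_skip_zero(&src, &dst);

    for (int i = 0; i < SIM_NPAGES; i++) {
        if (memcmp(src.ram[i], dst.ram[i], SIM_PAGE_SIZE) != 0) {
            stale++;
        }
    }
    return stale;
}
```

In this model page 0 is zero at the source, so nothing is sent for it and the destination keeps its firmware bytes there.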

Thanks for explaining.

This seems to concern the migration protocol: how should the guest
treat unsent pages? The patch causing the problem actually treats
zero pages as "not to be sent" at the source, but the other half is
missing: treating "not received" pages as zero pages at the destination.
I guess if the second half is added, the problem is gone: after page
transfer has completed and before the destination resumes, fill the
"not received" pages with zeros.
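
The proposal in the paragraph above could look roughly like the following toy model. It is a sketch only, not an actual patch: `DestState`, `dest_receive_page` and `dest_zero_unreceived` are invented names, and the model assumes that a page which becomes zero after it has already been sent is re-sent explicitly by the source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { SIM_PAGE_SIZE = 4096, SIM_NPAGES = 4 };

typedef struct {
    unsigned char ram[SIM_NPAGES][SIM_PAGE_SIZE];
    bool received[SIM_NPAGES]; /* set once the source sent the page */
} DestState;

/* Store an incoming page and remember that it came from the source. */
void dest_receive_page(DestState *d, int idx, const unsigned char *data)
{
    memcpy(d->ram[idx], data, SIM_PAGE_SIZE);
    d->received[idx] = true;
}

/* Before the destination resumes: clear every page the source never
 * sent. Returns the number of pages cleared. */
int dest_zero_unreceived(DestState *d)
{
    int cleared = 0;
    for (int i = 0; i < SIM_NPAGES; i++) {
        if (!d->received[i]) {
            memset(d->ram[i], 0, SIM_PAGE_SIZE);
            cleared++;
        }
    }
    return cleared;
}

/* Firmware sits in page 0 at the destination, the source sends only
 * page 1. Returns 1 if page 0 ends up zero and page 1 is preserved. */
int demo_zero_unreceived(void)
{
    static DestState d;
    static unsigned char payload[SIM_PAGE_SIZE];

    memset(&d, 0, sizeof d);
    memset(d.ram[0], 0xAA, SIM_PAGE_SIZE); /* firmware loaded at dest */
    memset(payload, 0x55, sizeof payload);

    dest_receive_page(&d, 1, payload);
    dest_zero_unreceived(&d);

    return d.ram[0][0] == 0 &&
           memcmp(d.ram[1], payload, SIM_PAGE_SIZE) == 0;
}
```

Note that this version writes to every unreceived page, which touches memory that may never have been allocated; that cost is part of why the approach was questioned in the thread.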

Make a working patch, we'll discuss it :) I do not see much
acceleration coming from there.
I would also not spend much time on this. I would either look for
an easy way to fix the initialization code so it does not unnecessarily load
data into RAM, or I will send a v2 of my patch following Eric's
There is no easy way to implement the flag and keep your original patch as
we have to implement this flag in all architectures which got broken by
your patch and I personally can fix only PPC64-pseries but not the others.

Furthermore your revert + new patches perfectly solve the problem, why
would we want to bother now with this new flag which nobody really needs
right now?

Please, please, revert the original patch or I'll try to do it :)

I tried, but there were concerns from the community. Alternatively, I found
the following solution. Please drop the 2 patches and try the
diff --git a/arch_init.c b/arch_init.c
index 5d32ecf..458bf8c 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -799,6 +799,8 @@ static int ram_load(QEMUFile *f, void *opaque, int 
                 while (total_ram_bytes) {
                     RAMBlock *block;
                     uint8_t len;
+                    void *base;
+                    ram_addr_t offset;

                     len = qemu_get_byte(f);
                     qemu_get_buffer(f, (uint8_t *)id, len);
@@ -822,6 +824,14 @@ static int ram_load(QEMUFile *f, void *opaque, int 
                         goto done;

+                    base = memory_region_get_ram_ptr(block->mr);
+                    for (offset = 0; offset < block->length;
+                         offset += TARGET_PAGE_SIZE) {
+                        if (!is_zero_page(base + offset)) {
+                            memset(base + offset, 0x00, TARGET_PAGE_SIZE);
+                        }
+                    }
                     total_ram_bytes -= length;

This is done at setup time, so there is no additional cost for zero checking
for each compressed page coming in.
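
For readers without the QEMU tree at hand, the idea of the loop in the diff can be tried in isolation. The following is a standalone sketch, not the patch itself: `sim_is_zero_page` and `clear_nonzero_pages` are hypothetical stand-ins for `is_zero_page()` and the loop over the RAM block, and the page size is fixed at 4 KiB for illustration. The point of checking before the memset is that pages which are already zero are only read and never written.

```c
#include <stddef.h>
#include <string.h>

enum { SIM_PAGE_SIZE = 4096 };

/* Return 1 if every byte of the page is zero, else 0. */
static int sim_is_zero_page(const unsigned char *p)
{
    for (size_t i = 0; i < SIM_PAGE_SIZE; i++) {
        if (p[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/* Walk a RAM block page by page and clear only the pages that hold
 * nonzero data. Returns how many pages were written. */
size_t clear_nonzero_pages(unsigned char *base, size_t length)
{
    size_t written = 0;
    for (size_t off = 0; off + SIM_PAGE_SIZE <= length;
         off += SIM_PAGE_SIZE) {
        if (!sim_is_zero_page(base + off)) {
            memset(base + off, 0x00, SIM_PAGE_SIZE);
            written++;
        }
    }
    return written;
}

/* One dirty byte in the second of four pages: exactly one page should
 * be written, and a second pass should write none. */
size_t demo_clear_nonzero_pages(void)
{
    static unsigned char block[4 * SIM_PAGE_SIZE];

    memset(block, 0, sizeof block);
    block[SIM_PAGE_SIZE + 17] = 0x42; /* leftover data in page 1 */

    size_t first = clear_nonzero_pages(block, sizeof block);
    size_t second = clear_nonzero_pages(block, sizeof block);
    return first + second;
}
```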

