Re: [Qemu-devel] RFC migration of zero pages
From: Peter Lieven
Subject: Re: [Qemu-devel] RFC migration of zero pages
Date: Thu, 31 Jan 2013 12:53:12 +0100
RFC patch is attached. Comments appreciated.
I have two concerns left:
a) What happens if a page turns from zero to non-zero during the first stage?
Is this page transferred in the same round or in the next?
b) What happens if live migration fails or is aborted and a migration is then
started again to the same target (if this is possible)? Is the memory at the
target reinitialized?
On 31.01.2013 at 10:37, Orit Wasserman <address@hidden> wrote:
> On 01/31/2013 11:25 AM, Peter Lieven wrote:
>>
>> On 31.01.2013 at 10:19, Orit Wasserman <address@hidden> wrote:
>>
>>> On 01/31/2013 11:00 AM, Peter Lieven wrote:
>>>>
>>>> On 31.01.2013 at 09:59, Orit Wasserman <address@hidden> wrote:
>>>>
>>>>> On 01/31/2013 10:37 AM, Peter Lieven wrote:
>>>>>>
>>>>>> On 31.01.2013 at 09:33, Orit Wasserman <address@hidden> wrote:
>>>>>>
>>>>>>> On 01/31/2013 10:10 AM, Peter Lieven wrote:
>>>>>>>>
>>>>>>>> On 31.01.2013 at 08:47, Orit Wasserman <address@hidden> wrote:
>>>>>>>>
>>>>>>>>> On 01/31/2013 08:57 AM, Peter Lieven wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I just came across an idea and would like feedback on whether it
>>>>>>>>>> makes sense or not.
>>>>>>>>>>
>>>>>>>>>> If a VM is started without preallocated memory, all memory that has
>>>>>>>>>> not been written to reads as zeros, right?
>>>>>>>>> Hi,
>>>>>>>>> No, the memory will be unmapped (we allocate on demand).
>>>>>>>>
>>>>>>>> Yes, but those unmapped pages will read as zeroes if the guest
>>>>>>>> accesses them?
>>>>>>> yes.
>>>>>>>>
>>>>>>>>>> If a VM with a lot of unwritten memory is migrated, or if the memory
>>>>>>>>>> contains a lot of zeroed-out pages (e.g. a Windows or Linux guest
>>>>>>>>>> with page sanitization), all this memory is allocated on the target
>>>>>>>>>> during live migration. Especially with KSM this leads to the problem
>>>>>>>>>> that this memory is allocated and might not be completely available,
>>>>>>>>>> as merging of the pages happens asynchronously.
>>>>>>>>>>
>>>>>>>>>> Wouldn't it make sense not to send zero pages in the first round,
>>>>>>>>>> where the complete RAM is sent (if it is detectable that we are in
>>>>>>>>>> this stage)?
>>>>>>>>> We send one byte per zero page at the moment (see is_dup_page); we
>>>>>>>>> can optimize this further by not sending it at all.
>>>>>>>>> I have to point out that this is a very idle guest, and we need to
>>>>>>>>> work on a loaded guest, which is the harder problem in migration.
>>>>>>>>
>>>>>>>> I was not talking about saving one byte (+ 8 bytes for the header); my
>>>>>>>> concern was that we memset all (dup) pages, including the special case
>>>>>>>> of a zero dup page, on the migration target.
>>>>>>>> This allocates the memory, does it not?
>>>>>>>>
>>>>>>>
>>>>>>>> If my assumption above that the guest reads unmapped memory as zeroes
>>>>>>>> is right, this mapping is not necessary in the case of a zero dup
>>>>>>>> page.
>>>>>>>>
>>>>>>>> We just have to make sure that we are still in the very first round
>>>>>>>> when deciding not to send a zero page, because otherwise it could be a
>>>>>>>> page that has become zero during migration, and that of course has to
>>>>>>>> be transferred.
>>>>>>>
>>>>>>> OK, so if we don't send the pages then they won't be allocated on the
>>>>>>> destination, which can both improve memory usage and reduce CPU
>>>>>>> consumption there.
>>>>>>> That can be good for an overcommit scenario.
>>>>>>
>>>>>> Yes. On the source host those zero pages have likely all been merged by
>>>>>> KSM already, but on the destination they are allocated and initially
>>>>>> consume real memory. This can be a problem if a lot of incoming
>>>>>> migrations happen at the same time.
>>>>>
>>>>> That can be very effective.
>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Also I noticed that the bottleneck in migrating unmapped pages is the
>>>>>>>>> detection of those pages, because we map the pages in order to check
>>>>>>>>> them; for a large guest this is very expensive, as mapping a page
>>>>>>>>> results in a page fault on the host.
>>>>>>>>> So what would be very helpful is locating those pages without mapping
>>>>>>>>> them, which looks very complicated.
>>>>>>>>
>>>>>>>> This would be a nice improvement, but as you said, a guest will sooner
>>>>>>>> or later allocate all memory if it is not totally idle. However,
>>>>>>>> bigger parts of this memory might have been reset to zeroes.
>>>>>>>> This happens on page deallocation in a Windows guest by default and
>>>>>>>> can also be enforced in Linux with page sanitization.
>>>>>>>
>>>>>>> True, but in those cases we will want to zero the page on the
>>>>>>> destination, as this is done for security reasons.
>>>>>>
>>>>>> If I migrate to a destination where initially all memory is unmapped,
>>>>>> not migrating the zero page turns it into an unmapped page (which reads
>>>>>> as zero?). Where is the security problem? It's like re-thinning on
>>>>>> storage. Or do I misunderstand something here? Is the actual mapping
>>>>>> information migrated?
>>>>>
>>>>> I was referring to pages that had some data and were migrated, so when
>>>>> the guest OS zeros them we need to zero them on the destination as well,
>>>>> because the data is there too.
>>>>
>>>> OK, so with the current implementation, can we effectively decide whether
>>>> a page is being transferred for the first time?
>>>
>>> In the old code (before 1.3 or 1.2) we had a separate function for the
>>> first full transfer, but now we don't.
>>> So I guess you will need to implement it; it shouldn't be too complicated.
>>> I would add a flag to the existing code.
>>>>
>>>> Do we always migrate the complete memory once and then iterate over dirty
>>>> pages? I have to check the code
>>>> that searches for dirty pages to confirm that.
>>> We set the whole bitmap as dirty at the beginning of migration, so in the
>>> first iteration all pages will be sent.
>>> The code is in arch_init.c; look at ram_save_setup and ram_save_iterate.
>>
>> I will have a look and send an RFC patch once I have tested it.
> Great!
diff --git a/arch_init.c b/arch_init.c
index dada6de..33f3b12 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -426,6 +426,8 @@ static void migration_bitmap_sync(void)
  * 0 means no dirty pages
  */
 
+static uint64_t complete_rounds;
+
 static int ram_save_block(QEMUFile *f, bool last_stage)
 {
     RAMBlock *block = last_seen_block;
@@ -451,6 +453,10 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
         if (!block) {
             block = QTAILQ_FIRST(&ram_list.blocks);
             complete_round = true;
+            if (!complete_rounds) {
+                error_report("ram_save_block: finished bulk ram migration");
+            }
+            complete_rounds++;
         }
     } else {
         uint8_t *p;
@@ -463,10 +469,17 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
         bytes_sent = -1;
         if (is_dup_page(p)) {
             acct_info.dup_pages++;
-            bytes_sent = save_block_hdr(f, block, offset, cont,
-                                        RAM_SAVE_FLAG_COMPRESS);
-            qemu_put_byte(f, *p);
-            bytes_sent += 1;
+            /* we can skip transferring zero pages in the first round because
+               memory is unmapped (reads as zero) at the target anyway or
+               initialized to zero in case of mem-prealloc. */
+            if (complete_rounds || *p) {
+                bytes_sent = save_block_hdr(f, block, offset, cont,
+                                            RAM_SAVE_FLAG_COMPRESS);
+                qemu_put_byte(f, *p);
+                bytes_sent += 1;
+            } else {
+                bytes_sent = 1;
+            }
         } else if (migrate_use_xbzrle()) {
             current_addr = block->offset + offset;
             bytes_sent = save_xbzrle_page(f, p, current_addr, block,
@@ -569,6 +582,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
     qemu_mutex_lock_ramlist();
 
     bytes_transferred = 0;
+    complete_rounds = 0;
     reset_ram_globals();
 
     if (migrate_use_xbzrle()) {
Follow-ups:
Re: [Qemu-devel] RFC migration of zero pages, Gleb Natapov, 2013/01/31
Re: [Qemu-devel] RFC migration of zero pages, Gleb Natapov, 2013/01/31
Re: [Qemu-devel] RFC migration of zero pages, Avi Kivity, 2013/01/31
Re: [Qemu-devel] RFC migration of zero pages, Gleb Natapov, 2013/01/31
Re: [Qemu-devel] RFC migration of zero pages, Avi Kivity, 2013/01/31
Re: [Qemu-devel] RFC migration of zero pages, Michael S. Tsirkin, 2013/01/31