Re: [Qemu-devel] [RFC PATCH] migration: reintroduce skipped zero pages


From: Peter Lieven
Subject: Re: [Qemu-devel] [RFC PATCH] migration: reintroduce skipped zero pages
Date: Mon, 12 May 2014 12:30:39 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0

On 12.05.2014 12:23, Paolo Bonzini wrote:
> On 09/05/2014 01:45, Peter Lieven wrote:
>> commit f1c72795a introduced skipping of all zero pages during the
>> bulk phase of ram migration. In theory this should have worked,
>> however the underlying assumption that the memory of the target VM
>> is totally empty (zeroed) was wrong. Although qemu accepts an incoming
>> migration, BIOS, ROMs or tables are already set up. If e.g. a ROM differs
>> between source and target, we get memory corruption if a page
>> is zero at the source but not at the target. Therefore the
>> original patch was reverted later on.
>>
>> This patch now reintroduces the feature to skip zero pages.
>> However, this time it has to be explicitly turned on through
>> a migration capability which should only be enabled if both
>> source and destination support it.
>>
>> The feature makes sense especially if you expect a significant portion
>> of zero pages while bandwidth or disk space is limited, because even
>> if a zero page is compressed we still transfer 9 bytes for each page.
>
> What is the actual effect of this?  Is the is_zero_range in 
> ram_handle_compressed actually a bottleneck?

It's not a bottleneck, but every small portion that can be saved wouldn't hurt.

>
> A positive effect on throughput is unlikely.  Multiple compressed pages are 
> sent in a single socket send.  If 1% of memory is zero (which seems already a 
> lot to me), you would save 0.002% bandwidth or disk space, and you would 
> waste a lot of time in the new loop of ram_load.

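(Working through those numbers, assuming 4 KiB target pages and the 9 bytes per
zero page mentioned above: a compressed zero page costs the 8-byte block header
plus one fill byte, i.e. 9 bytes per 4096-byte page, roughly 0.22% of the page.
So with ~99% zero pages the saving is about 0.2% of guest RAM on the wire, and
with ~1% zero pages only about 0.002%.)
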
You are right, I have had this on the list since a time when every page was one
send call.

In Windows guests the number of zero pages stays high during the guest's
lifetime because all pages are zeroed out, but I agree the patch would only make
sense if we did not have to take special care on the target side. If all pages
there were guaranteed to be zero, it would make sense, as was the idea in the
original patch. This is why I put the RFC in ;-)
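
For completeness, enabling the proposed capability would look roughly like this
(a sketch using the existing migrate-set-capabilities QMP command with the
capability name from this patch; per the schema text, setting it on the source
is sufficient, provided the destination supports it):

  { "execute": "migrate-set-capabilities",
    "arguments": { "capabilities": [
      { "capability": "skip-zero-pages", "state": true } ] } }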

Peter

>
> Even on a freshly-booted guest, where 99% of the pages are zero, you would 
> get 0.2% savings only.
>
> Paolo
>
>> Signed-off-by: Peter Lieven <address@hidden>
>> ---
>>  arch_init.c                   |   44 +++++++++++++++++++++++++++++++++--------
>>  include/migration/migration.h |    2 +-
>>  migration.c                   |    9 +++++++++
>>  qapi-schema.json              |   11 ++++++++---
>>  4 files changed, 54 insertions(+), 12 deletions(-)
>>
>> diff --git a/arch_init.c b/arch_init.c
>> index 995f56d..2579302 100644
>> --- a/arch_init.c
>> +++ b/arch_init.c
>> @@ -123,7 +123,8 @@ static uint64_t bitmap_sync_count;
>>  #define RAM_SAVE_FLAG_EOS      0x10
>>  #define RAM_SAVE_FLAG_CONTINUE 0x20
>>  #define RAM_SAVE_FLAG_XBZRLE   0x40
>> -/* 0x80 is reserved in migration.h start with 0x100 next */
>> +/* 0x80 is reserved in migration.h */
>> +#define RAM_SAVE_FLAG_ZERO_TARGET 0x100
>>
>>  static struct defconfig_file {
>>      const char *filename;
>> @@ -575,8 +576,9 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
>>      MemoryRegion *mr;
>>      ram_addr_t current_addr;
>>
>> -    if (!block)
>> +    if (!block) {
>>          block = QTAILQ_FIRST(&ram_list.blocks);
>> +    }
>>
>>      while (true) {
>>          mr = block->mr;
>> @@ -619,11 +621,16 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
>>                      }
>>                  }
>>              } else if (is_zero_range(p, TARGET_PAGE_SIZE)) {
>> -                acct_info.dup_pages++;
>> -                bytes_sent = save_block_hdr(f, block, offset, cont,
>> -                                            RAM_SAVE_FLAG_COMPRESS);
>> -                qemu_put_byte(f, 0);
>> -                bytes_sent++;
>> +                if (!ram_bulk_stage || !migrate_skip_zero_pages()) {
>> +                    acct_info.dup_pages++;
>> +                    bytes_sent = save_block_hdr(f, block, offset, cont,
>> +                                                RAM_SAVE_FLAG_COMPRESS);
>> +                    qemu_put_byte(f, 0);
>> +                    bytes_sent++;
>> +                } else {
>> +                    acct_info.skipped_pages++;
>> +                    bytes_sent = 0;
>> +                }
>>                  /* Must let xbzrle know, otherwise a previous (now 0'd) cached
>>                   * page would be stale
>>                   */
>> @@ -752,6 +759,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>  {
>>      RAMBlock *block;
>>      int64_t ram_bitmap_pages; /* Size of bitmap in pages, including gaps */
>> +    uint64_t flags = 0;
>>
>>      mig_throttle_on = false;
>>      dirty_rate_high_cnt = 0;
>> @@ -812,7 +820,11 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>      migration_bitmap_sync();
>>      qemu_mutex_unlock_iothread();
>>
>> -    qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
>> +    if (migrate_skip_zero_pages()) {
>> +        flags |= RAM_SAVE_FLAG_ZERO_TARGET;
>> +    }
>> +
>> +    qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE | flags);
>>
>>      QTAILQ_FOREACH(block, &ram_list.blocks, next) {
>>          qemu_put_byte(f, strlen(block->idstr));
>> @@ -1082,6 +1094,22 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>>                      goto done;
>>                  }
>>
>> +                /* ensure that the target memory is really zero initialized
>> +                 * so we can skip zero pages in the bulk phase.
>> +                 * TODO: Ideally this should not be needed since we mmap the
>> +                 * target memory, however machine individual code may still
>> +                 * load BIOS, ROMs etc. although we await an incoming migration.
>> +                 * (see commit 9ef051e5) */
>> +                if (flags & RAM_SAVE_FLAG_ZERO_TARGET) {
>> +                    ram_addr_t offset = 0;
>> +                    void *base = memory_region_get_ram_ptr(block->mr);
>> +                    for (; offset < block->length; offset += TARGET_PAGE_SIZE) {
>> +                        if (!is_zero_range(base + offset, TARGET_PAGE_SIZE)) {
>> +                            memset(base + offset, 0x00, TARGET_PAGE_SIZE);
>> +                        }
>> +                    }
>> +                }
>> +
>>                  total_ram_bytes -= length;
>>              }
>>          }
>> diff --git a/include/migration/migration.h b/include/migration/migration.h
>> index 3cb5ba8..4320f6a 100644
>> --- a/include/migration/migration.h
>> +++ b/include/migration/migration.h
>> @@ -144,8 +144,8 @@ void migrate_del_blocker(Error *reason);
>>
>>  bool migrate_rdma_pin_all(void);
>>  bool migrate_zero_blocks(void);
>> -
>>  bool migrate_auto_converge(void);
>> +bool migrate_skip_zero_pages(void);
>>
>>  int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
>>                           uint8_t *dst, int dlen);
>> diff --git a/migration.c b/migration.c
>> index 52cda27..033b958 100644
>> --- a/migration.c
>> +++ b/migration.c
>> @@ -565,6 +565,15 @@ int migrate_use_xbzrle(void)
>>      return s->enabled_capabilities[MIGRATION_CAPABILITY_XBZRLE];
>>  }
>>
>> +bool migrate_skip_zero_pages(void)
>> +{
>> +    MigrationState *s;
>> +
>> +    s = migrate_get_current();
>> +
>> +    return s->enabled_capabilities[MIGRATION_CAPABILITY_SKIP_ZERO_PAGES];
>> +}
>> +
>>  int64_t migrate_xbzrle_cache_size(void)
>>  {
>>      MigrationState *s;
>> diff --git a/qapi-schema.json b/qapi-schema.json
>> index 36cb964..8add600 100644
>> --- a/qapi-schema.json
>> +++ b/qapi-schema.json
>> @@ -760,19 +760,24 @@
>>  #          mlock()'d on demand or all at once. Refer to docs/rdma.txt for usage.
>>  #          Disabled by default. (since 2.0)
>>  #
>> +# @auto-converge: If enabled, QEMU will automatically throttle down the guest
>> +#          to speed up convergence of RAM migration. (since 1.6)
>> +#
>>  # @zero-blocks: During storage migration encode blocks of zeroes efficiently. This
>>  #          essentially saves 1MB of zeroes per block on the wire. Enabling requires
>>  #          source and target VM to support this feature. To enable it is sufficient
>>  #          to enable the capability on the source VM. The feature is disabled by
>>  #          default. (since 1.6)
>>  #
>> -# @auto-converge: If enabled, QEMU will automatically throttle down the guest
>> -#          to speed up convergence of RAM migration. (since 1.6)
>> +# @skip-zero-pages: Skip zero pages during bulk phase of ram migration. Enabling requires
>> +#          source and target VM to support this feature. To enable it is sufficient
>> +#          to enable the capability on the source VM. The feature is disabled by
>> +#          default. (since 2.1)
>>  #
>>  # Since: 1.2
>>  ##
>>  { 'enum': 'MigrationCapability',
>> -  'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks'] }
>> +  'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks', 'skip-zero-pages'] }
>>
>>  ##
>>  # @MigrationCapabilityStatus
>>
>



