
[Qemu-devel] [v2 RESEND 0/4] Fix long vm downtime during live migration


From: Liang Li
Subject: [Qemu-devel] [v2 RESEND 0/4] Fix long vm downtime during live migration
Date: Mon, 2 Nov 2015 15:36:59 +0800

KVM commit 3ea3b7fa9af067982f34b introduces a mechanism that lazily
collapses small sptes into large sptes, which is intended to solve the
performance drop seen when live migration fails or is canceled. When
dirty logging is stopped, the rmap is scanned in the
KVM_SET_USER_MEMORY_REGION ioctl context so as to drop the small sptes.
Scanning the rmap and dropping the small sptes is a time-consuming
operation that takes dozens of milliseconds; the actual time depends on
the VM's memory size. For a VM with 8GB RAM, it takes about 30ms.

The current QEMU code stops dirty logging during the pause and copy
stage by calling migration_end(). migration_end() is now a
time-consuming operation because it calls
memory_global_dirty_log_stop(), which triggers the rmap scan and the
dropping of small sptes. Calling migration_end() before all the
vmstate data has been transferred to the destination therefore
prolongs VM downtime.

migration_end() should be deferred until after all the data has been
transferred to the destination. blk_mig_cleanup() can be deferred too.
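
The effect of the reordering can be illustrated with a toy model. This
is only a sketch, not QEMU code: the step names "send_ram" and
"send_device_state" and their costs (3 ms and 1 ms) are assumptions
made up for illustration, and the 30 ms cleanup cost is the approximate
figure quoted above for a VM with 8GB RAM.

```python
# Toy model of the pause-and-copy stage; not QEMU code.
# Step names and the 3 ms / 1 ms transfer costs are illustrative assumptions.

DIRTY_LOG_STOP_MS = 30  # approximate rmap scan cost quoted for 8GB RAM


def downtime_ms(steps):
    """Sum the cost of every step that runs before the destination resumes."""
    total = 0
    for name, cost_ms in steps:
        if name == "destination_resumes":
            break
        total += cost_ms
    return total


# Current order: migration_end() runs before the transfer has finished.
cleanup_first = [
    ("send_ram", 3),
    ("migration_end", DIRTY_LOG_STOP_MS),
    ("send_device_state", 1),
    ("destination_resumes", 0),
]

# Proposed order: migration_end() runs after everything has been sent.
cleanup_deferred = [
    ("send_ram", 3),
    ("send_device_state", 1),
    ("destination_resumes", 0),
    ("migration_end", DIRTY_LOG_STOP_MS),
]

print(downtime_ms(cleanup_first))     # 34
print(downtime_ms(cleanup_deferred))  # 4
```

In this model the cleanup cost is the same in both orderings; only
whether it falls inside the downtime window changes.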

Effect of this patch
====================
For a VM with 8GB RAM, this patch can reduce the VM downtime by about 30 ms.

You can follow these steps to see the effect of this patch.

1. Start a VM with the command:
  ./qemu-system-x86_64 -enable-kvm -smp 4 -m 8192 -monitor stdio \
      /share/rhel6u5.qcow
   on the source host and
  ./qemu-system-x86_64 -enable-kvm -smp 4 -m 8192 -monitor stdio \
      /share/rhel6u5.qcow -incoming tcp:0:4444
   on the destination host.
2. In the source-side QEMU monitor:
  (qemu) migrate_set_speed 0
  (qemu) migrate_set_downtime 0.01
  (qemu) migrate -d tcp:($DST_HOST_IP):4444
  (qemu) info migrate
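
To compare runs, the downtime can be pulled out of the "info migrate"
text with a small helper. This is a minimal sketch that assumes the
monitor prints a line of the form "downtime: N milliseconds"; the
helper name parse_downtime_ms is hypothetical, and the sample text is
made up for illustration rather than captured from a real run.

```python
import re


def parse_downtime_ms(info_migrate_text):
    """Return the reported downtime in milliseconds, or None if absent."""
    match = re.search(
        r"^downtime:\s*(\d+)\s*milliseconds",
        info_migrate_text,
        re.MULTILINE,
    )
    return int(match.group(1)) if match else None


# Made-up excerpt for illustration only; assumed format, not a real log.
sample = """Migration status: completed
downtime: 4 milliseconds
"""

print(parse_downtime_ms(sample))
print(parse_downtime_ms("Migration status: active\n"))
```

The helper returns None while the migration is still active, since no
downtime line is expected until the migration has completed.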

The actual VM downtime in my environment:
=====================================
|without this patch| with this patch|
|-----------------------------------|
|      35ms        |     4ms        |
=====================================
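
As a quick check of the figures in the table, the sketch below computes
the saving and compares both results with the 10 ms limit requested by
"migrate_set_downtime 0.01" (the argument is in seconds). The variable
names are chosen here for illustration only.

```python
# Figures from the table above.
without_patch_ms = 35
with_patch_ms = 4

# migrate_set_downtime 0.01 requests a maximum downtime of 0.01 s.
limit_ms = 0.01 * 1000

saved_ms = without_patch_ms - with_patch_ms
reduction_pct = 100 * saved_ms / without_patch_ms

print(f"saved {saved_ms} ms ({reduction_pct:.1f}% lower downtime)")
print("without patch within limit:", without_patch_ms <= limit_ms)
print("with patch within limit:", with_patch_ms <= limit_ms)
```

With these numbers, only the patched run stays within the requested
10 ms downtime.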


Changes:
  * Remove qemu_savevm_state_cancel() in migrate_fd_cleanup().
  * Add 2 more patches for code cleanup.
  * Add more details in the commit message.

Liang Li (4):
  migration: defer migration_end & blk_mig_cleanup
  migration: rename qemu_savevm_state_cancel
  migration: rename cancel to cleanup in SaveVMHandles
  migration: code clean up

 include/migration/vmstate.h |  2 +-
 include/sysemu/sysemu.h     |  2 +-
 migration/block.c           | 10 ++--------
 migration/migration.c       | 13 ++++++-------
 migration/ram.c             | 10 ++--------
 migration/savevm.c          | 10 +++++-----
 trace-events                |  2 +-
 7 files changed, 18 insertions(+), 31 deletions(-)

-- 
1.9.1



