From: Chegu Vinod
Subject: Re: [Qemu-devel] [PATCH 00/41] Migration cleanups and latency improvements
Date: Tue, 19 Feb 2013 09:59:33 -0800
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2

On 2/15/2013 9:46 AM, Paolo Bonzini wrote:
This series does many of the improvements that the migration thread
promised.  It removes buffering, lets a large amount of code run outside
the big QEMU lock, and removes some duplication between incoming and
outgoing migration.

Patches 1 to 7 are simple cleanups.

Patches 8 to 14 simplify the lifecycle of the migration thread and
the migration QEMUFile.

Patches 15 to 18 add fine-grained locking to the block migration
data structures, so that patches 19 to 21 can move RAM/block live
migration out of the big QEMU lock.  At this point blocking writes
will not starve other threads seeking to grab the big QEMU mutex:
patches 22 to 24 remove the buffering and clean up the code.
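
As a rough, self-contained illustration of the locking pattern described in
the paragraph above (all names here are made up, plain pthreads stand in for
QEMU's QemuMutex, and a usleep() stands in for the blocking write; this is a
sketch of the general idea, not the actual QEMU code):

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;  /* "big QEMU lock" stand-in */
static pthread_mutex_t mig_lock = PTHREAD_MUTEX_INITIALIZER;  /* fine-grained lock */
static long mig_dirty_pages;                                  /* shared state it protects */

static void vcpu_dirty_one_page(void)
{
    pthread_mutex_lock(&big_lock);   /* vCPU work runs under the big lock... */
    pthread_mutex_lock(&mig_lock);   /* ...but migration state has its own lock */
    mig_dirty_pages++;
    pthread_mutex_unlock(&mig_lock);
    pthread_mutex_unlock(&big_lock);
}

static void *migration_thread(void *opaque)
{
    (void)opaque;
    for (int iter = 0; iter < 5; iter++) {
        long batch;

        /* Snapshot the shared state under the fine-grained lock only. */
        pthread_mutex_lock(&mig_lock);
        batch = mig_dirty_pages;
        mig_dirty_pages = 0;
        pthread_mutex_unlock(&mig_lock);

        /* Do the potentially blocking write with no locks held, so threads
         * waiting on big_lock are never starved by slow I/O. */
        printf("iteration %d: flushing %ld dirty pages\n", iter, batch);
        usleep(100 * 1000);          /* stands in for a blocking socket write */
    }
    return NULL;
}

int main(void)
{
    pthread_t tid;

    pthread_create(&tid, NULL, migration_thread, NULL);
    for (int i = 0; i < 100000; i++) {
        vcpu_dirty_one_page();
    }
    pthread_join(tid, NULL);
    return 0;
}

The key point is that the fine-grained lock is only held for short,
non-blocking updates, while the slow write happens with no lock held at all.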

Patches 25 to 28 are more cleanups.

Patches 29 to 33 improve QEMUFile so that patches 34 and 35 can
use QEMUFile to write out data, instead of MigrationState.
Patches 36 to 41 can then remove the useless QEMUFile wrapper
that remains.

Please review and test!  You can find these patches at
git://github.com/bonzini/qemu.git, branch migration-thread-20130115.

Juan Quintela (1):
  Rename buffered_ to migration_

Paolo Bonzini (40):
  migration: simplify while loop
  migration: always use vm_stop_force_state
  migration: move more error handling to migrate_fd_cleanup
  migration: push qemu_savevm_state_cancel out of qemu_savevm_state_*
  block-migration: remove useless calls to blk_mig_cleanup
  qemu-file: pass errno from qemu_fflush via f->last_error
  migration: use qemu_file_set_error to pass error codes back to
    qemu_savevm_state
  qemu-file: temporarily expose qemu_file_set_error and qemu_fflush
  migration: flush all data to fd when buffered_flush is called
  migration: use qemu_file_set_error
  migration: simplify error handling
  migration: do not nest flushing of device data
  migration: prepare to access s->state outside critical sections
  migration: cleanup migration (including thread) in the iothread
  block-migration: remove variables that are never read
  block-migration: small preparatory changes for locking
  block-migration: document usage of state across threads
  block-migration: add lock
  migration: reorder SaveVMHandlers members
  migration: run pending/iterate callbacks out of big lock
  migration: run setup callbacks out of big lock
  migration: yay, buffering is gone
  qemu-file: make qemu_fflush and qemu_file_set_error private again
  migration: eliminate last_round
  migration: detect error before sleeping
  migration: remove useless qemu_file_get_error check
  migration: use qemu_file_rate_limit consistently
  migration: merge qemu_popen_cmd with qemu_popen
  qemu-file: fsync a writable stdio QEMUFile
  qemu-file: check exit status when closing a pipe QEMUFile
  qemu-file: add writable socket QEMUFile
  qemu-file: simplify and export qemu_ftell
  migration: use QEMUFile for migration channel lifetime
  migration: use QEMUFile for writing outgoing migration data
  migration: use qemu_ftell to compute bandwidth
  migration: small changes around rate-limiting
  migration: move rate limiting to QEMUFile
  migration: move contents of migration_close to migrate_fd_cleanup
  migration: eliminate s->migration_file
  migration: inline migrate_fd_close

 arch_init.c                   |   14 ++-
 block-migration.c             |  167 +++++++++++++++------
 docs/migration.txt            |   20 +---
 include/migration/migration.h |   12 +--
 include/migration/qemu-file.h |   21 +--
 include/migration/vmstate.h   |   21 ++-
 include/qemu/atomic.h         |    1 +
 include/sysemu/sysemu.h       |    6 +-
 migration-exec.c              |   39 +-----
 migration-fd.c                |   47 +------
 migration-tcp.c               |   33 +----
 migration-unix.c              |   33 +----
 migration.c                   |  345 ++++++++---------------------------------
 savevm.c                      |  214 +++++++++++++++-----------
 util/osdep.c                  |    6 +-
 15 files changed, 367 insertions(+), 612 deletions(-)


I am still in the midst of reviewing the changes but gave them a try. The following are my preliminary observations:

- The multi-second freezes at the start of migration of larger guests (i.e. 128GB and higher) are no longer observable with the above changes. (The simple timer script that does a gettimeofday() every 100ms didn't complain about any delays; a sketch of such a timer follows these observations.)

- Noticed improvements in bandwidth utilization during the iterative pre-copy phase and during the "downtime" phase.

- The total migration time was reduced, more so for larger guests. (Note: the undesirably large actual "downtime" for larger guests is a different topic that still needs to be addressed independently of these changes.)
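
For reference, a minimal timer of this sort can look like the following (a rough sketch; the exact thresholds and output format of the ./timer used below may differ). It sleeps for a nominal 100ms, measures the real elapsed wall-clock time with gettimeofday(), and prints a "delay of N ms" line whenever the gap is much larger than expected; inside a guest, such a gap means the vCPUs were frozen for that long.

/* timer.c - minimal freeze detector (illustrative sketch).
 * Build with: cc -o timer timer.c */
#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>

static long long now_ms(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (long long)tv.tv_sec * 1000 + tv.tv_usec / 1000;
}

int main(void)
{
    long long prev = now_ms();

    for (;;) {
        usleep(100 * 1000);            /* nominal 100 ms tick */
        long long now = now_ms();
        long long delay = now - prev - 100;
        if (delay > 500) {             /* arbitrary "freeze" threshold */
            printf("delay of %lld ms\n", delay);
            fflush(stdout);
        }
        prev = now;
    }
    return 0;
}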

Some details follow below...

Thanks
Vinod


Details:
----------

Host and guest kernels are running 3.8-rc5.

Comparing upstream (QEMU 1.4.50) vs. Paolo's branch (QEMU 1.3.92 based), i.e.
git clone git://github.com/bonzini/qemu.git -b migration-thread-20130115

The first set of experiments is with [not-so-interesting] *idle* guests of different sizes.
The second experiment was with an OLTP workload.

A) Idle guests:
--------------------
(The migration speed was set to 10G and the downtime was set to 2s)

1) 5vcpu/32G  - *idle* guest

QEMU 1.4.50:
total time: 31801 milliseconds
downtime: 2831 milliseconds

Paolo's branch:
total time: 29012 milliseconds
downtime: 1987 milliseconds

--
2) 10vcpu/64G - *idle* guest

QEMU 1.4.50:
total time: 62699 milliseconds
downtime: 2506 milliseconds

Paolo's branch:
total time: 59174 milliseconds
downtime: 2451 milliseconds

--
3) 10vcpu/128G  - *idle* guest

QEMU 1.4.50:
total time: 123179 milliseconds
downtime: 2566 milliseconds

address@hidden ~]# ./timer
delay of 3083 ms                   <- freeze (@start of migration)
delay of 1916 ms                   <- freeze (due to downtime)

Paolo's branch:
total time: 116809 milliseconds
downtime: 2703 milliseconds

address@hidden ~]# ./timer
delay of 2820 ms                 <- freeze (due to downtime)

--
4) 20vcpu/256G - *idle* guest

QEMU 1.4.50:
total time: 277775 milliseconds
downtime: 3718 milliseconds

address@hidden ~]# ./timer
delay of 6317 ms                 <- freeze (@ start of migration)
delay of 2952 ms                 <- freeze (due to downtime)

Paolo's branch:
total time: 261790 milliseconds
downtime: 3809 milliseconds

address@hidden ~]# ./timer
delay of 3982 ms            <-  freeze (due to downtime)

--
5) 40vcpu/512G - *idle* guest

QEMU 1.4.50:
total time: 631654 milliseconds
downtime: 7252 milliseconds

address@hidden ~]# ./timer
delay of 12713 ms              <- freeze (@ start of migration)
delay of 6099 ms                <- freeze (due to downtime)

Paolo's branch:
total time: 603252 milliseconds
downtime: 6452 milliseconds

address@hidden ~]# ./timer
delay of 6724 ms              <- freeze (due to downtime)   

--
6) 80vcpu/784G - *idle* guest

QEMU 1.4.50:
total time: 1003210 milliseconds
downtime: 8932 milliseconds

address@hidden ~]# ./timer
delay of 18941 ms               <- freeze (@ start of migration.)
delay of 8395 ms                 <- freeze (due to downtime)
delay of 2451 ms                 <- freeze (on new host...why?)

Paolo's branch:
total time: 959378 milliseconds
downtime: 8416 milliseconds

address@hidden ~]# ./timer
delay of 8938 ms                       <- freeze (due to downtime)
delay of 935 ms                         <- freeze (on new host...why?)

-------

B) Guest with an OLTP workload :
---------------------------------------------

Guest: 80vcpu / 784GB (yes, I know that typical guest sizes today aren't this huge...but this is just an experiment, keeping in mind that guests are continuing to get fatter)

OLTP workload with 100 users doing writes/reads. Using tmpfs...as I don't yet have access to real I/O :-(

Host was ~70% busy and the guest was ~60% busy.

The migration speed was set to 10G and the downtime was set to 4s.

No guest freezes were observed, but there were significant drops in TPS at the start of migration, etc. Observed a ~30-40% improvement in bandwidth utilization during the iterative pre-copy phase.

The workload did NOT converge even after 30 mins or so...with either upstream QEMU or with Paolo's changes. (Note: the lack-of-convergence issue needs to be pursued separately...based on ideas proposed in the past.)
