From: Chegu Vinod
Subject: Re: [Qemu-devel] [PATCH 00/41] Migration cleanups and latency improvements
Date: Tue, 19 Feb 2013 09:59:33 -0800
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2
On 2/15/2013 9:46 AM, Paolo Bonzini wrote:
This series does many of the improvements that the migration thread promised. It removes buffering, lets a large amount of code run outside the big QEMU lock, and removes some duplication between incoming and outgoing migration.

Patches 1 to 7 are simple cleanups.

Patches 8 to 14 simplify the lifecycle of the migration thread and the migration QEMUFile.

Patches 15 to 18 add fine-grained locking to the block migration data structures, so that patches 19 to 21 can move RAM/block live migration out of the big QEMU lock. At this point blocking writes will not starve other threads seeking to grab the big QEMU mutex; patches 22 to 24 remove the buffering and clean up the code.

Patches 25 to 28 are more cleanups.

Patches 29 to 33 improve QEMUFile so that patches 34 and 35 can use QEMUFile to write out data, instead of MigrationState. Patches 36 to 41 can then remove the useless QEMUFile wrapper that remains.

Please review and test! You can find these patches at git://github.com/bonzini/qemu.git, branch migration-thread-20130115.

Juan Quintela (1):
  Rename buffered_ to migration_

Paolo Bonzini (40):
  migration: simplify while loop
  migration: always use vm_stop_force_state
  migration: move more error handling to migrate_fd_cleanup
  migration: push qemu_savevm_state_cancel out of qemu_savevm_state_*
  block-migration: remove useless calls to blk_mig_cleanup
  qemu-file: pass errno from qemu_fflush via f->last_error
  migration: use qemu_file_set_error to pass error codes back to qemu_savevm_state
  qemu-file: temporarily expose qemu_file_set_error and qemu_fflush
  migration: flush all data to fd when buffered_flush is called
  migration: use qemu_file_set_error
  migration: simplify error handling
  migration: do not nest flushing of device data
  migration: prepare to access s->state outside critical sections
  migration: cleanup migration (including thread) in the iothread
  block-migration: remove variables that are never read
  block-migration: small preparatory changes for locking
  block-migration: document usage of state across threads
  block-migration: add lock
  migration: reorder SaveVMHandlers members
  migration: run pending/iterate callbacks out of big lock
  migration: run setup callbacks out of big lock
  migration: yay, buffering is gone
  qemu-file: make qemu_fflush and qemu_file_set_error private again
  migration: eliminate last_round
  migration: detect error before sleeping
  migration: remove useless qemu_file_get_error check
  migration: use qemu_file_rate_limit consistently
  migration: merge qemu_popen_cmd with qemu_popen
  qemu-file: fsync a writable stdio QEMUFile
  qemu-file: check exit status when closing a pipe QEMUFile
  qemu-file: add writable socket QEMUFile
  qemu-file: simplify and export qemu_ftell
  migration: use QEMUFile for migration channel lifetime
  migration: use QEMUFile for writing outgoing migration data
  migration: use qemu_ftell to compute bandwidth
  migration: small changes around rate-limiting
  migration: move rate limiting to QEMUFile
  migration: move contents of migration_close to migrate_fd_cleanup
  migration: eliminate s->migration_file
  migration: inline migrate_fd_close

 arch_init.c                   |  14 ++-
 block-migration.c             | 167 +++++++++++++++------
 docs/migration.txt            |  20 +---
 include/migration/migration.h |  12 +--
 include/migration/qemu-file.h |  21 +--
 include/migration/vmstate.h   |  21 ++-
 include/qemu/atomic.h         |   1 +
 include/sysemu/sysemu.h       |   6 +-
 migration-exec.c              |  39 +-----
 migration-fd.c                |  47 +------
 migration-tcp.c               |  33 +----
 migration-unix.c              |  33 +----
 migration.c                   | 345 ++++++++---------------------------
 savevm.c                      | 214 +++++++++++++++-----
 util/osdep.c                  |   6 +-
 15 files changed, 367 insertions(+), 612 deletions(-)
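To make the "run outside the big QEMU lock" idea a bit more concrete, here is a small standalone pthread sketch of the pattern (invented names, not QEMU's actual block-migration code): shared migration work is guarded by its own fine-grained mutex, so the sender thread can grab the pending work and then do a potentially blocking write without holding the lock the main thread needs.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t blk_lock = PTHREAD_MUTEX_INITIALIZER; /* fine-grained lock */
static int pending_blocks;      /* protected by blk_lock */
static int quit;                /* protected by blk_lock */

/* Sender thread: holds blk_lock only long enough to take the pending work,
 * then sends with no lock held, so a slow destination cannot stall the
 * main thread. */
static void *sender_thread(void *opaque)
{
    for (;;) {
        int n, done;

        pthread_mutex_lock(&blk_lock);
        n = pending_blocks;
        pending_blocks = 0;
        done = quit;
        pthread_mutex_unlock(&blk_lock);

        if (n) {
            printf("sending %d blocks (no lock held)\n", n);
        }
        if (done) {
            break;
        }
        usleep(1000);
    }
    return NULL;
}

int main(void)
{
    pthread_t tid;
    int i;

    pthread_create(&tid, NULL, sender_thread, NULL);

    /* Main thread marks blocks dirty; it only ever takes the fine-grained
     * lock and never waits on the sender's I/O. */
    for (i = 0; i < 1000; i++) {
        pthread_mutex_lock(&blk_lock);
        pending_blocks++;
        pthread_mutex_unlock(&blk_lock);
        usleep(100);
    }

    pthread_mutex_lock(&blk_lock);
    quit = 1;
    pthread_mutex_unlock(&blk_lock);
    pthread_join(tid, NULL);
    return 0;
}

In QEMU itself this corresponds to the block-migration lock added in patches 15 to 18 and to the pending/iterate/setup callbacks moved out of the big lock in patches 19 to 21.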
I'm still in the midst of reviewing the changes, but gave them a try. The following are my preliminary observations:

- The multi-second freezes at the start of migration of larger guests (i.e. 128GB and higher) aren't observable with the above changes. (The simple timer script that does a gettimeofday every 100ms didn't complain about delays etc.; a sketch of such a script appears after the idle-guest results below.)
- Noticed improvements in bandwidth utilization during the iterative pre-copy phase and during the "downtime" phase.
- The total migration time was reduced... more so for larger guests. (Note: the undesirably large actual "downtime" for larger guests is a different topic that still needs to be addressed, independently of these changes.)

Some details follow below...

Thanks
Vinod

Details:
--------
Host and guest kernels are running 3.8-rc5.

Comparing upstream (QEMU 1.4.50) vs. Paolo's branch (QEMU 1.3.92 based), i.e.
git clone git://github.com/bonzini/qemu.git -b migration-thread-20130115

The first set of experiments is with [not-so-interesting] *idle* guests of different sizes. The second experiment was with an OLTP workload.

A) Idle guests:
---------------
(The migration speed was set to 10G and the downtime was set to 2s.)

1) 5vcpu/32G - *idle* guest

QEMU 1.4.50:
total time: 31801 milliseconds
downtime: 2831 milliseconds

Paolo's branch:
total time: 29012 milliseconds
downtime: 1987 milliseconds
--
2) 10vcpu/64G - *idle* guest

QEMU 1.4.50:
total time: 62699 milliseconds
downtime: 2506 milliseconds

Paolo's branch:
total time: 59174 milliseconds
downtime: 2451 milliseconds
--
3) 10vcpu/128G - *idle* guest

QEMU 1.4.50:
total time: 123179 milliseconds
downtime: 2566 milliseconds

address@hidden ~]# ./timer
delay of 3083 ms <- freeze (@ start of migration)
delay of 1916 ms <- freeze (due to downtime)

Paolo's branch:
total time: 116809 milliseconds
downtime: 2703 milliseconds

address@hidden ~]# ./timer
delay of 2820 ms <- freeze (due to downtime)
--
4) 20vcpu/256G - *idle* guest

QEMU 1.4.50:
total time: 277775 milliseconds
downtime: 3718 milliseconds

address@hidden ~]# ./timer
delay of 6317 ms <- freeze (@ start of migration)
delay of 2952 ms <- freeze (due to downtime)

Paolo's branch:
total time: 261790 milliseconds
downtime: 3809 milliseconds

address@hidden ~]# ./timer
delay of 3982 ms <- freeze (due to downtime)
--
5) 40vcpu/512G - *idle* guest

QEMU 1.4.50:
total time: 631654 milliseconds
downtime: 7252 milliseconds

address@hidden ~]# ./timer
delay of 12713 ms <- freeze (@ start of migration)
delay of 6099 ms <- freeze (due to downtime)

Paolo's branch:
total time: 603252 milliseconds
downtime: 6452 milliseconds

address@hidden ~]# ./timer
delay of 6724 ms <- freeze (due to downtime)
--
6) 80vcpu/784G - *idle* guest

QEMU 1.4.50:
total time: 1003210 milliseconds
downtime: 8932 milliseconds

address@hidden ~]# ./timer
delay of 18941 ms <- freeze (@ start of migration)
delay of 8395 ms <- freeze (due to downtime)
delay of 2451 ms <- freeze (on new host... why?)

Paolo's branch:
total time: 959378 milliseconds
downtime: 8416 milliseconds

address@hidden ~]# ./timer
delay of 8938 ms <- freeze (due to downtime)
delay of 935 ms <- freeze (on new host... why?)
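The ./timer script itself isn't included in this thread. As a rough sketch of what such a freeze detector might look like (the 100ms interval comes from the description above; the 300ms reporting threshold is purely an assumption):

#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>

int main(void)
{
    struct timeval prev, now;

    gettimeofday(&prev, NULL);
    for (;;) {
        usleep(100 * 1000);             /* nominal 100ms tick */
        gettimeofday(&now, NULL);
        long ms = (now.tv_sec - prev.tv_sec) * 1000
                  + (now.tv_usec - prev.tv_usec) / 1000;
        if (ms > 300) {                 /* tick came late: the guest was frozen */
            printf("delay of %ld ms\n", ms);
        }
        prev = now;
    }
    return 0;                           /* not reached; stop with Ctrl-C */
}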
--
B) Guest with an OLTP workload:
-------------------------------
Guest: 80vcpu / 784GB (yes, I know that typical guest sizes today aren't this huge... but this is just an experiment, keeping in mind that guests are continuing to get fatter).

OLTP workload with 100 users doing writes/reads. Using tmpfs... as I don't yet have access to real I/O :-(
The host was ~70% busy and the guest was ~60% busy.
The migration speed was set to 10G and the downtime was set to 4s.

No guest freezes were observed, but there were significant drops in TPS at the start of migration etc. Observed about a 30-40% improvement in bandwidth utilization during the iterative pre-copy phase. The workload did NOT converge even after 30 mins or so... with either upstream QEMU or with Paolo's changes. (Note: the lack-of-convergence issue needs to be pursued separately... based on ideas proposed in the past.)
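For reference, the speed and downtime settings mentioned above correspond to the standard QEMU monitor commands below; the exact invocation used for these runs isn't shown in the thread, and the destination URI is just a placeholder:

(qemu) migrate_set_speed 10G
(qemu) migrate_set_downtime 4
(qemu) migrate -d tcp:<destination>:4444
(qemu) info migrate

(migrate_set_downtime 2 for the idle-guest runs in section A.)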