From: Chegu Vinod
Subject: Re: [Qemu-devel] Fwd: [RFC 00/27] Migration thread (WIP)
Date: Thu, 26 Jul 2012 11:41:09 -0700
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:14.0) Gecko/20120713 Thunderbird/14.0
Hello,

Thanks for sharing this early/WIP version for evaluation. I'm still in the middle of reviewing the code, but I wanted to share a couple of quick observations.

I tried to use it to migrate a 128G/10-VCPU guest (speed set to 10G and downtime to 2s; the monitor commands I used are appended at the end of this mail). I ran it once with no workload (i.e. an idle guest) and a second time with SpecJBB running in the guest.

The idle-guest case seemed to migrate fine:

capabilities: xbzrle: off
Migration status: completed
transferred ram: 3811345 kbytes
remaining ram: 0 kbytes
total ram: 134226368 kbytes
total time: 199743 milliseconds

In the SpecJBB case I ran into issues during stage 3: the source host's qemu and the guest hung. I need to debug this more (if you already have some hints, please let me know):

capabilities: xbzrle: off
Migration status: active
transferred ram: 127618578 kbytes
remaining ram: 2386832 kbytes
total ram: 134226368 kbytes
total time: 526139 milliseconds

(qemu) qemu_savevm_state_complete called
qemu_savevm_state_complete calling ram_save_complete   <--- hung somewhere after this (I need to get more info)

As with the non-migration-thread version, the SpecJBB workload completed before the migration attempted to move to stage 3 (i.e. the migration didn't converge while the workload was still active).

BTW, with this version of the bits (i.e. while running SpecJBB, which is supposed to dirty quite a bit of memory), I noticed that there wasn't much change in the bandwidth usage of the dedicated 10Gb private network link (it was still around 1.5-3.0 Gb/sec). I expected this to be a little better since we now have a separate thread... not sure what else is in play here? (NUMA locality of where the migration thread runs, or some other basic tuning in the implementation?)

I also have a high-level design question (perhaps folks have already thought about it and filed it as a potential future optimization?): would it be possible to offload the iothread completely from all migration-related activity, have one thread (with the appropriate protection) gather the list of dirty pages, and have one or more threads dedicated to pushing multiple streams of data to saturate the allocated network bandwidth? This may help with large and busy guests. Comments? (A rough sketch of the split I have in mind is appended below, after the monitor commands.)

There are other implications of doing all this (like burning more host CPU cycles), but perhaps it could be made configurable based on the user's needs, e.g. for hosts running fewer but larger guests with no oversubscription.

Thanks,
Vinod
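---

For reference, the setup above was done with the usual HMP monitor commands, along these lines (the destination host and port here are placeholders, not the actual values I used):

(qemu) migrate_set_speed 10G
(qemu) migrate_set_downtime 2
(qemu) migrate -d tcp:<dest-host>:<port>
(qemu) info migrate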
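And here is a minimal standalone sketch of the scanner/multi-sender split I was describing above. This is illustrative pthread code, not QEMU code: every name in it is made up, the dirty-bitmap walk is simulated, and the per-stream socket writes are stubbed out.

/* Hypothetical sketch only -- none of these names exist in QEMU.
 * One producer walks a (simulated) dirty bitmap and queues page
 * numbers; several sender threads drain the queue, each standing
 * in for one outgoing migration stream. */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_PAGES   4096          /* toy guest: 4096 pages           */
#define NR_SENDERS 4             /* parallel migration streams      */
#define QUEUE_CAP  256

static struct {
    long buf[QUEUE_CAP];
    int head, tail, count;
    bool done;                   /* producer finished scanning      */
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
} q = { .lock = PTHREAD_MUTEX_INITIALIZER,
        .not_empty = PTHREAD_COND_INITIALIZER,
        .not_full = PTHREAD_COND_INITIALIZER };

static void queue_put(long page)
{
    pthread_mutex_lock(&q.lock);
    while (q.count == QUEUE_CAP)
        pthread_cond_wait(&q.not_full, &q.lock);
    q.buf[q.tail] = page;
    q.tail = (q.tail + 1) % QUEUE_CAP;
    q.count++;
    pthread_cond_signal(&q.not_empty);
    pthread_mutex_unlock(&q.lock);
}

static bool queue_get(long *page)
{
    pthread_mutex_lock(&q.lock);
    while (q.count == 0 && !q.done)
        pthread_cond_wait(&q.not_empty, &q.lock);
    if (q.count == 0) {          /* drained and producer is done    */
        pthread_mutex_unlock(&q.lock);
        return false;
    }
    *page = q.buf[q.head];
    q.head = (q.head + 1) % QUEUE_CAP;
    q.count--;
    pthread_cond_signal(&q.not_full);
    pthread_mutex_unlock(&q.lock);
    return true;
}

/* Stands in for "walk the dirty bitmap under the proper locks".   */
static void *dirty_scanner(void *arg)
{
    for (long page = 0; page < NR_PAGES; page++)
        if (page % 3 == 0)       /* pretend every 3rd page is dirty */
            queue_put(page);
    pthread_mutex_lock(&q.lock);
    q.done = true;               /* wake senders blocked on an      */
    pthread_cond_broadcast(&q.not_empty);  /* empty queue           */
    pthread_mutex_unlock(&q.lock);
    return NULL;
}

/* Stands in for "send the page over this thread's own socket".    */
static void *page_sender(void *arg)
{
    long id = (long)arg, page, sent = 0;
    while (queue_get(&page))
        sent++;                  /* real code would write() the page */
    printf("stream %ld sent %ld pages\n", id, sent);
    return NULL;
}

int main(void)
{
    pthread_t scanner, senders[NR_SENDERS];
    pthread_create(&scanner, NULL, dirty_scanner, NULL);
    for (long i = 0; i < NR_SENDERS; i++)
        pthread_create(&senders[i], NULL, page_sender, (void *)i);
    pthread_join(scanner, NULL);
    for (int i = 0; i < NR_SENDERS; i++)
        pthread_join(senders[i], NULL);
    return 0;
}

The point of the sketch is just the shape: the iothread is not involved at all, the scan and the network writes are decoupled by a bounded queue, and the number of sender threads is the knob you'd tune per the configurability point above.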