From: Chegu Vinod
Subject: [Qemu-devel] Live Migration of a large guest : guest frozen on the destination host
Date: Mon, 11 Jun 2012 07:02:06 -0700
User-agent: Mozilla/5.0 (Windows NT 5.1; rv:12.0) Gecko/20120428 Thunderbird/12.0.1

Hello,

I am having some issues trying to live migrate a large guest and would like to get some pointers on how to go about debugging this. Here is some info on the configuration.

Hardware:
  Two DL980s, each with 80 Westmere cores + 1 TB of RAM.
  Using a 10G NIC private link (back to back) between the two DL980s.

Host software used:
  Host kernel: 3.4.1
  Qemu versions used:
    Case 1: upstream qemu (1.1.50) - from qemu.git
    Case 2: 1.0.92 + Juan Quintela's huge_memory changes

Guest: 40 VCPUs + 512 GB

Guest software used:
  RHEL6.3 RC1 (had some basic boot issues with the 3.4.1 kernel and udevd..)
  Guest is booted off an FC LUN (visible to both the hosts).

[Note: I am not using virsh/virt-manager etc., just qemu itself to start the guest, and the qemu monitor to drive the live migration etc. I have set the migration speed to 10G but haven't changed the downtime (default : 30ms).]

I tried to live migrate this large guest using either of the qemu versions (i.e. Case 1 or Case 2) and observed the following:

When the guest was idling, I was able to live migrate it and have it come up fine on the other host. I was able to interact with the guest on the destination host.

With workloads (e.g. AIM7-compute, SpecJBB or Google Stress App Test (SAT)) running in the guest, when we tried to do a live migration we observed that [after a while] the source host claims that the live migration is complete, but the guest on the destination host is often in a "frozen/hung" state. We can't really interact with it or ping it.

I am still trying to capture more information, but was also hoping to get some clues/tips from the experts on these mailing lists.

[BTW, is there a way to get a snapshot of the image of the guest on the source host just before the "downtime" (i.e. start of stage 3) and compare that with the image of the guest on the destination host just before it's about to resume? Is such a debugging feature already available?]

Thanks
Vinod
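
For a sense of scale, here is a rough back-of-the-envelope sketch using the figures quoted above (10G link, 30 ms downtime, 512 GB guest). It is not a measurement and not taken from qemu: it assumes the link sustains full line rate with no protocol overhead, that "512 GB" means 512 GiB, and that guest pages are 4 KiB. The function names are made up for illustration.

```python
# Rough sizing of the final (stop-and-copy) stage for the setup described above.
# Assumptions (not measured): the 10G link sustains full line rate, there is no
# protocol overhead, and guest pages are 4 KiB.

LINK_BITS_PER_S = 10_000_000_000   # 10G private link
DOWNTIME_MS = 30                   # default max downtime quoted in the post
GUEST_BYTES = 512 * 1024**3        # 512 GB guest, taken here as GiB
PAGE_BYTES = 4096                  # assumed page size


def max_final_stage_bytes(link_bits_per_s: int, downtime_ms: int) -> int:
    """Bytes that fit through the link within the allowed downtime."""
    return link_bits_per_s * downtime_ms // (8 * 1000)


def full_pass_seconds(guest_bytes: int, link_bits_per_s: int) -> float:
    """Time to send every guest page once at line rate."""
    return guest_bytes * 8 / link_bits_per_s


budget = max_final_stage_bytes(LINK_BITS_PER_S, DOWNTIME_MS)
print(f"final-stage budget: {budget / 1e6:.1f} MB ({budget // PAGE_BYTES} pages)")
print(f"one full pass: {full_pass_seconds(GUEST_BYTES, LINK_BITS_PER_S):.0f} s")
```

Under those assumptions, the memory still dirty at the start of the final stage would have to shrink to about 37.5 MB to fit in 30 ms, while a single pass over all guest memory takes roughly 440 seconds. A workload that dirties memory quickly may therefore behave very differently from an idle guest.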