Re: [Qemu-devel] [PATCH v3 0/4] migation: unbreak postcopy recovery


From: Balamuruhan S
Subject: Re: [Qemu-devel] [PATCH v3 0/4] migation: unbreak postcopy recovery
Date: Mon, 2 Jul 2018 13:34:45 +0530
User-agent: Mutt/1.9.2 (2017-12-15)

On Wed, Jun 27, 2018 at 09:22:42PM +0800, Peter Xu wrote:
> v3:
> - keep the recovery logic even for RDMA by dropping the 3rd patch and
>   touch up the original 4th patch (current 3rd patch) to suite that [Dave]
> 
> v2:
> - break the first patch into several
> - fix a QEMUFile leak
> 
> Please review.  Thanks,
Hi Peter,

I applied this patchset on top of upstream QEMU to test the postcopy
pause/recovery feature on PowerPC.

I used an NFS-shared qcow2 image between the source and target hosts.

source:
# ppc64-softmmu/qemu-system-ppc64 --enable-kvm --nographic -vga none \
-machine pseries -m 64G,slots=128,maxmem=128G -smp 16,maxcpus=32 \
-device virtio-blk-pci,drive=rootdisk -drive \
file=/home/bala/sharing/hostos-ppc64le.qcow2,if=none,cache=none,format=qcow2,id=rootdisk \
-monitor telnet:127.0.0.1:1234,server,nowait -net nic,model=virtio \
-net user -redir tcp:2000::22

To keep the VM under load, I ran stress-ng inside the guest:

# stress-ng --cpu 6 --vm 6 --io 6

target:
# ppc64-softmmu/qemu-system-ppc64 --enable-kvm --nographic -vga none \
-machine pseries -m 64G,slots=128,maxmem=128G -smp 16,maxcpus=32 \
-device virtio-blk-pci,drive=rootdisk -drive \
file=/home/bala/sharing/hostos-ppc64le.qcow2,if=none,cache=none,format=qcow2,id=rootdisk \
-monitor telnet:127.0.0.1:1235,server,nowait -net nic,model=virtio \
-net user -redir tcp:2001::22 -incoming tcp:0:4445
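
The two command lines above are meant to differ only in the monitor port,
the ssh redirect port and the -incoming option. A minimal Python sketch
that builds both from one template makes that easy to check; the helper
name qemu_args is hypothetical, and the flags are copied from the
commands above rather than validated against any QEMU version:

```python
def qemu_args(monitor_port, ssh_redir_port, incoming=None):
    """Build the argv used in this test; only the ports and -incoming vary."""
    args = [
        "ppc64-softmmu/qemu-system-ppc64", "--enable-kvm", "--nographic",
        "-vga", "none", "-machine", "pseries",
        "-m", "64G,slots=128,maxmem=128G", "-smp", "16,maxcpus=32",
        "-device", "virtio-blk-pci,drive=rootdisk",
        "-drive", "file=/home/bala/sharing/hostos-ppc64le.qcow2,"
                  "if=none,cache=none,format=qcow2,id=rootdisk",
        "-monitor", f"telnet:127.0.0.1:{monitor_port},server,nowait",
        "-net", "nic,model=virtio", "-net", "user",
        "-redir", f"tcp:{ssh_redir_port}::22",
    ]
    if incoming is not None:
        # Only the target listens for an incoming migration stream.
        args += ["-incoming", incoming]
    return args


source = qemu_args(1234, 2000)
target = qemu_args(1235, 2001, incoming="tcp:0:4445")
print(" ".join(source))
print(" ".join(target))
```

Generating both lists from one function avoids accidental differences in
machine type, memory or drive options between source and target.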

I enabled postcopy on both source and destination from the QEMU monitor:

(qemu) migrate_set_capability postcopy-ram on

From the source QEMU monitor:
(qemu) migrate -d tcp:10.45.70.203:4445
(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off
zero-blocks: off compress: off events: off postcopy-ram: on x-colo: off
release-ram: off block: off return-path: off pause-before-switchover:
off x-multifd: off dirty-bitmaps: off postcopy-blocktime: off
late-block-activate: off 
Migration status: active
total time: 2331 milliseconds
expected downtime: 300 milliseconds
setup: 65 milliseconds
transferred ram: 38914 kbytes
throughput: 273.16 mbps
remaining ram: 67063784 kbytes
total ram: 67109120 kbytes
duplicate: 1627 pages
skipped: 0 pages
normal: 9706 pages
normal bytes: 38824 kbytes
dirty sync count: 1
page size: 4 kbytes
multifd bytes: 0 kbytes
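
To compare snapshots like the one above, here is a minimal Python sketch
that turns the "key: value" lines of "info migrate" output into a
dictionary. The helper name parse_info_migrate is hypothetical and not
part of QEMU; it assumes only the text layout shown in this mail and does
not try to handle the wrapped capabilities line properly:

```python
def parse_info_migrate(text):
    """Parse 'key: value' lines of HMP 'info migrate' output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if not sep:
            # Lines such as 'globals:' carry no value; skip them.
            continue
        fields[key.strip()] = value.strip()
    return fields


sample = """Migration status: active
total time: 2331 milliseconds
transferred ram: 38914 kbytes
remaining ram: 67063784 kbytes
total ram: 67109120 kbytes
"""
info = parse_info_migrate(sample)
print(info["Migration status"], info["remaining ram"])
```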

Then I triggered postcopy from the source:
(qemu) migrate_start_postcopy

After triggering postcopy from the source, I tried to pause the
postcopy migration on the target:

(qemu) migrate_pause

On the target I see the error:
error while loading state section id 4(ram)
qemu-system-ppc64: Detected IO failure for postcopy. Migration paused.

On the source I see the error:
qemu-system-ppc64: Detected IO failure for postcopy. Migration paused.

Later I tried to recover from the target monitor:
(qemu) migrate_recover qemu+ssh://10.45.70.203/system
Migrate recovery is triggered already

but the source still remains in the postcopy-paused state:
(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off
zero-blocks: off compress: off events: off postcopy-ram: on x-colo: off
release-ram: off block: off return-path: off pause-before-switchover:
off x-multifd: off dirty-bitmaps: off postcopy-blocktime: off
late-block-activate: off 
Migration status: postcopy-paused
total time: 222841 milliseconds
expected downtime: 382991 milliseconds
setup: 65 milliseconds
transferred ram: 385270 kbytes
throughput: 265.06 mbps
remaining ram: 8150528 kbytes
total ram: 67109120 kbytes
duplicate: 14679647 pages
skipped: 0 pages
normal: 63937 pages
normal bytes: 255748 kbytes
dirty sync count: 2
page size: 4 kbytes
multifd bytes: 0 kbytes
dirty pages rate: 854740 pages
postcopy request count: 374
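
As a rough check of how far the migration got before it paused, the
numbers from the two "info migrate" snapshots can be compared directly.
This is only a sketch of the arithmetic on the values reported above; the
helper kbytes is hypothetical:

```python
def kbytes(value):
    """Convert a string such as '8150528 kbytes' to an integer kbyte count."""
    return int(value.split()[0])


total = kbytes("67109120 kbytes")
remaining_active = kbytes("67063784 kbytes")  # first snapshot (active)
remaining_paused = kbytes("8150528 kbytes")   # second snapshot (postcopy-paused)

fraction_left = remaining_paused / total
reduced_by = remaining_active - remaining_paused
print(f"remaining: {fraction_left:.1%} of guest RAM")
print(f"remaining ram dropped by {reduced_by} kbytes between snapshots")
```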

Later I also tried to recover postcopy from the source monitor:
(qemu) migrate_recover qemu+ssh://10.45.193.21/system
Migrate recover can only be run when postcopy is paused.

It looks broken; please let me know if I missed something
in this test.

Thank you,
Bala
> 
> Peter Xu (4):
>   migration: delay postcopy paused state
>   migration: move income process out of multifd
>   migration: unbreak postcopy recovery
>   migration: unify incoming processing
> 
>  migration/ram.h       |  2 +-
>  migration/exec.c      |  3 ---
>  migration/fd.c        |  3 ---
>  migration/migration.c | 44 ++++++++++++++++++++++++++++++++++++-------
>  migration/ram.c       | 11 +++++------
>  migration/savevm.c    |  6 +++---
>  migration/socket.c    |  5 -----
>  7 files changed, 46 insertions(+), 28 deletions(-)
> 
> -- 
> 2.17.1
> 
> 



