From: Neil Skrypuch
Subject: [Qemu-devel] Live migration results in non-working virtio-net device (sometimes)
Date: Thu, 30 Jan 2014 13:23:04 -0500
User-agent: KMail/4.10.2 (Linux/3.9.1-gentoo-r1; KDE/4.10.2; x86_64; ; )

First, let me briefly outline the way we use live migration, as it is probably 
not typical. We use live migration (with block migration) to make backups of 
VMs with zero downtime. The basic process goes like this:

1) migrate src VM -> dest VM
2) migration completes
3) cont src VM
4) gracefully shut down dest VM
5) dest VM's disk image is now a valid backup

In general, this works very well.

Up until now we have been using qemu-kvm 1.1.2 and have not had any issues 
with the above process. I am now attempting to upgrade us to a newer version 
of qemu, but all newer versions I've tried occasionally result in the
virtio-net device ceasing to function on the src VM after step 3.

I am able to reproduce this reliably, given enough iterations; it happens in 
roughly 2% of all migrations.
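
As a rough guide to what "enough iterations" means, assume (this is an assumption, not something measured here) that each migration fails independently with probability p = 0.02. The number of runs n needed to see at least one failure with 95% probability then follows from:

```latex
P(\text{at least one failure in } n \text{ runs}) = 1 - (1 - p)^{n}
\qquad\Longrightarrow\qquad
n \ge \frac{\ln(1 - 0.95)}{\ln(1 - 0.02)} \approx 148.3
```

Under that assumption, about 149 migrations are needed before a clean run starts to be meaningful evidence that a given change fixed the problem.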

Here is the complete qemu command line for the src VM:

/usr/bin/qemu-system-x86_64 -machine accel=kvm \
  -drive file=/var/lib/kvm/testbackup.polldev.com.img,if=virtio \
  -m 2048 -smp 4,cores=4,sockets=1,threads=1 \
  -net nic,macaddr=52:54:98:00:00:00,model=virtio \
  -net tap,script=/etc/qemu-ifup-br2,downscript=no \
  -curses \
  -name "testbackup.polldev.com",process=testbackup.polldev.com \
  -monitor unix:/var/lib/kvm/monitor/testbackup,server,nowait

The dest VM:

/usr/bin/qemu-system-x86_64 -machine accel=kvm \
  -drive file=/backup/testbackup.polldev.com.img.bak20140129,if=virtio \
  -m 2048 -smp 4,cores=4,sockets=1,threads=1 \
  -net nic,macaddr=52:54:98:00:00:00,model=virtio \
  -net tap,script=no,downscript=no \
  -curses \
  -name "testbackup.polldev.com",process=testbackup.polldev.com \
  -monitor unix:/var/lib/kvm/monitor/testbackup.bak,server,nowait \
  -incoming tcp:0:4444

The migration is performed like so:

echo "migrate -b tcp:localhost:4444" | socat STDIO UNIX-CONNECT:/var/lib/kvm/monitor/testbackup
echo "migrate_set_speed 1G" | socat STDIO UNIX-CONNECT:/var/lib/kvm/monitor/testbackup
# wait
echo cont | socat STDIO UNIX-CONNECT:/var/lib/kvm/monitor/testbackup
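
The wait step is not spelled out above. One way it could be scripted, as a minimal sketch only, is to poll the monitor's "info migrate" command until it reports a final status. The helper names (hmp, migration_status, wait_for_migration), the MONITOR variable, and the 5-second polling interval are all hypothetical, and the exact status strings may differ between qemu versions:

```shell
#!/bin/sh
# Sketch only: helper names and the polling interval are assumptions,
# not part of the procedure described in this report.
MONITOR=${MONITOR:-/var/lib/kvm/monitor/testbackup}

# Send a single HMP command to the monitor socket and print the reply.
hmp() {
    echo "$1" | socat STDIO "UNIX-CONNECT:$MONITOR"
}

# Read "info migrate" output on stdin and print only the status word,
# e.g. "active", "completed" or "failed".
migration_status() {
    tr -d '\r' \
        | sed -n 's/.*Migration status: *\([a-z-]*\).*/\1/p' \
        | head -n 1
}

# Poll until the migration reaches a final state.
# Returns 0 on "completed", 1 on "failed" or "cancelled".
wait_for_migration() {
    while :; do
        status=$(hmp "info migrate" | migration_status)
        case "$status" in
            completed) return 0 ;;
            failed|cancelled)
                echo "migration $status" >&2
                return 1 ;;
        esac
        sleep 5
    done
}
```

With these definitions, the "# wait" line could become "wait_for_migration || exit 1", so that cont is only issued on the src VM once the migration has actually finished.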

The guest in question is a minimal install of CentOS 6.5.

I have observed this issue across the following qemu versions:

qemu 1.4.2
qemu 1.6.0
qemu 1.6.1
qemu 1.7.0

I also attempted to test qemu 1.5.3, but live migration flat out crashed there 
(totally different issue).

I have also tested a number of other scenarios with qemu 1.6.0, all of which 
exhibit the same failure mode:

qemu 1.6.0 + host kernel 3.1.0
qemu 1.6.0 + host kernel 3.10.7
qemu 1.6.0 + host kernel 3.10.17
qemu 1.6.0 + virtio with -netdev/-device syntax
qemu 1.6.0 + accel=tcg

The one case I have found that works properly is the following:

qemu 1.6.0 + e1000

It is worth noting that when the virtio-net device stops working in the 
guest, removing and reinserting the virtio-net kernel module makes the device 
work again (except with 1.4.2, where this had no effect).

As mentioned above, I can reproduce this with minimal effort, and I am willing 
to test any patches or provide further details as necessary.

- Neil


