From: Michael R. Hines
Subject: [Qemu-devel] [RFC PATCH RDMA support v2: 1/6] add openfabrics RDMA libraries, configure options to build
Date: Mon, 11 Feb 2013 17:49:52 -0500
From: "Michael R. Hines" <address@hidden>
This patchset introduces RDMA-based live migration to QEMU.
A copy of this documentation is located online:
http://wiki.qemu.org/Features/RDMALiveMigration
DESIGN:
==========
1. In order to provide maximum cross-device compatibility, we use the
librdmacm library, which abstracts out the RDMA capabilities of each
individual type of RDMA device, including InfiniBand, iWARP, and
RoCE. This patch has been tested on both RoCE and InfiniBand
devices from Mellanox.
2. A new file named "migration-rdma.c" contains the core code required
to perform librdmacm connection establishment and the transfer of
actual RDMA contents.
3. Files "arch_init.c" and "savevm.c" have been modified to transfer the
VM's memory in the standard live migration path using RDMA
instead of TCP.
4. The original logic for device migration and protocol synchronization
is unchanged: it still runs over TCP as it normally does, at the same
time as the RDMA memory transfer.
5. Currently, the XBZRLE capability and the detection of zero pages
(dup_page()) significantly slow down the empirical throughput observed
when RDMA is activated, so the code path skips these capabilities when
RDMA is enabled. Hopefully, we can stop doing this in the future and
come up with a way to preserve these capabilities simultaneously with
the use of RDMA.
PERFORMANCE:
============
Using a 40 Gbps InfiniBand link, average worst-case throughput under a
stress test ($ stress --vm-bytes 1024M --vm 1 --vm-keep):
1. RDMA: approximately 26 Gbps
2. TCP: approximately 8 Gbps (using IPoIB, IP over InfiniBand)
Average downtime (stop time) ranges between 28 and 33 milliseconds.
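To put the two throughput figures above in proportion, here is a small
back-of-the-envelope sketch. It assumes a single pass over the 1024 MiB
region that stress keeps dirtying and ignores re-dirtied pages and
protocol overhead, so the times are lower bounds and not measurements.
The variable and function names are illustrative only.

```shell
#!/bin/sh
# Throughput figures quoted in this cover letter (Gbps).
rdma_gbps=26
tcp_gbps=8
# Assumption: one pass over the 1024 MiB region used by the stress test.
size_mib=1024

# Ratio of RDMA to TCP throughput.
speedup=$(awk -v r="$rdma_gbps" -v t="$tcp_gbps" 'BEGIN { printf "%.2f", r / t }')

# Seconds to move size_mib at a given rate: bits / (Gbps * 1e9).
pass_seconds() {
    awk -v m="$size_mib" -v g="$1" \
        'BEGIN { printf "%.2f", (m * 1024 * 1024 * 8) / (g * 1e9) }'
}

rdma_secs=$(pass_seconds "$rdma_gbps")
tcp_secs=$(pass_seconds "$tcp_gbps")

echo "RDMA/TCP throughput ratio: ${speedup}x"
echo "One ${size_mib} MiB pass: RDMA ${rdma_secs}s, TCP ${tcp_secs}s"
```

A real migration iterates over dirty pages several times, so total
migration time depends on the guest's dirty rate as well as on link
throughput.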
An *exhaustive* paper (2010) with additional performance details is
linked on the QEMU wiki:
http://wiki.qemu.org/Features/RDMALiveMigration
USAGE:
==========
Complete instructions for compiling and running with RDMA are also
available on the wiki (probably too much for a cover letter).
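Before running ./configure --enable-rdma, it may help to check that the
librdmacm development files are installed. The sketch below is a
standalone approximation of the probe this patch adds to configure: it
compiles the same two-line test program and links it against -lrdmacm.
The function name probe_rdma and the use of ${CC:-cc} are assumptions
for illustration; the real script relies on configure's own
compile_prog helper and $TMPC.

```shell
#!/bin/sh
# Standalone approximation of the librdmacm probe in this patch.
# Assumption: ${CC:-cc} stands in for the compiler configure would pick.
probe_rdma() {
    tmpdir=$(mktemp -d) || return 1
    # Same test program as the configure hunk in this patch.
    cat > "$tmpdir/probe.c" <<EOF
#include <rdma/rdma_cma.h>
int main(void) { return 0; }
EOF
    # The probe succeeds only if the header is found and -lrdmacm links.
    if ${CC:-cc} -o "$tmpdir/probe" "$tmpdir/probe.c" -lrdmacm 2>/dev/null; then
        result=yes
    else
        result=no
    fi
    rm -rf "$tmpdir"
    echo "$result"
}

rdma_available=$(probe_rdma)
echo "librdmacm usable: $rdma_available"
```

If this prints "no", configure --enable-rdma with this patch applied is
expected to stop in feature_not_found "rdma".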
Signed-off-by: Michael R. Hines <address@hidden>
---
Makefile.objs | 1 +
configure | 25 +++++++++++++++++++++++++
2 files changed, 26 insertions(+)
diff --git a/Makefile.objs b/Makefile.objs
index 68eb0ce..38767cc 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -57,6 +57,7 @@ common-obj-$(CONFIG_POSIX) += os-posix.o
common-obj-$(CONFIG_LINUX) += fsdev/
common-obj-y += migration.o migration-tcp.o
+common-obj-$(CONFIG_RDMA) += migration-rdma.o
common-obj-y += qemu-char.o #aio.o
common-obj-y += block-migration.o
common-obj-y += page_cache.o
diff --git a/configure b/configure
index b7635e4..893935f 100755
--- a/configure
+++ b/configure
@@ -170,6 +170,7 @@ xfs=""
vhost_net="no"
kvm="no"
+rdma="no"
gprof="no"
debug_tcg="no"
debug="no"
@@ -897,6 +898,10 @@ for opt do
;;
--enable-virtio-blk-data-plane) virtio_blk_data_plane="yes"
;;
+ --enable-rdma) rdma="yes"
+ ;;
+ --disable-rdma) rdma="no"
+ ;;
*) echo "ERROR: unknown option $opt"; show_help="yes"
;;
esac
@@ -1087,6 +1092,8 @@ echo " --enable-bluez enable bluez stack connectivity"
echo " --disable-slirp disable SLIRP userspace network connectivity"
echo " --disable-kvm disable KVM acceleration support"
echo " --enable-kvm enable KVM acceleration support"
+echo " --disable-rdma disable RDMA-based migration support"
+echo " --enable-rdma enable RDMA-based migration support"
echo " --enable-tcg-interpreter enable TCG with bytecode interpreter (TCI)"
echo " --disable-nptl disable usermode NPTL support"
echo " --enable-nptl enable usermode NPTL support"
@@ -1718,6 +1725,18 @@ EOF
libs_softmmu="$sdl_libs $libs_softmmu"
fi
+if test "$rdma" = "yes" ; then
+ cat > $TMPC <<EOF
+#include <rdma/rdma_cma.h>
+int main(void) { return 0; }
+EOF
+ rdma_libs="-lrdmacm"
+ if ! compile_prog "" "$rdma_libs" ; then
+ feature_not_found "rdma"
+ fi
+
+fi
+
##########################################
# VNC TLS/WS detection
if test "$vnc" = "yes" -a \( "$vnc_tls" != "no" -o "$vnc_ws" != "no" \) ; then
@@ -3318,6 +3337,7 @@ echo "Linux AIO support $linux_aio"
echo "ATTR/XATTR support $attr"
echo "Install blobs $blobs"
echo "KVM support $kvm"
+echo "RDMA support $rdma"
echo "TCG interpreter $tcg_interpreter"
echo "fdt support $fdt"
echo "preadv support $preadv"
@@ -4278,6 +4298,11 @@ if [ "$pixman" = "internal" ]; then
echo "config-host.h: subdir-pixman" >> $config_host_mak
fi
+if test "$rdma" = "yes" ; then
+echo "CONFIG_RDMA=y" >> $config_host_mak
+echo "LIBS+=$rdma_libs" >> $config_host_mak
+fi
+
# build tree in object directory in case the source is not in the current directory
DIRS="tests tests/tcg tests/tcg/cris tests/tcg/lm32"
DIRS="$DIRS pc-bios/optionrom pc-bios/spapr-rtas"
--
1.7.10.4