[Qemu-devel] [PATCH v4 23/23] docs: Update pvrdma device documentation
From: Yuval Shaia
Subject: [Qemu-devel] [PATCH v4 23/23] docs: Update pvrdma device documentation
Date: Sun, 18 Nov 2018 14:28:43 +0200
The interface with the device has changed with the addition of support for
MAD packets.
Adjust the documentation accordingly.
While at it, fix a minor mistake that could suggest a relation between
using RXE on the host and compatibility with bare-metal
peers.
Signed-off-by: Yuval Shaia <address@hidden>
---
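Note (not part of the commit): the socket naming described in section 3.2
of the updated document can be sketched as follows. This is a minimal
illustration in Python, assuming the name is formed by joining the -s, -d
and -p values with '-', as the mlx5_0 example in the text suggests. The
helper name mux_socket_path is hypothetical and is not part of rdmacm-mux.

```python
def mux_socket_path(socket_path="/var/run/rdmacm-mux", device="rxe0", port=1):
    """Return the UNIX socket file name rdmacm-mux is expected to create.

    Assumption: the three command line values (-s, -d, -p) are joined
    with '-', matching the example given in the documentation.
    The defaults mirror the documented defaults of the three options.
    """
    return f"{socket_path}-{device}-{port}"


# Device mlx5_0 on port 2, default socket path
print(mux_socket_path(device="mlx5_0", port=2))  # /var/run/rdmacm-mux-mlx5_0-2
```

With all defaults the sketch yields /var/run/rdmacm-mux-rxe0-1, which is the
path passed to -chardev in the example of section 3.5.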
docs/pvrdma.txt | 103 +++++++++++++++++++++++++++++++++++++++---------
1 file changed, 84 insertions(+), 19 deletions(-)
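Note (not part of the commit): section 3.4 of the updated document says the
dev-caps parameters define top limits and that the final values are adjusted
by the backend device limitations. One plausible reading is an element-wise
minimum of the requested and backend values. The Python sketch below assumes
exactly that; the real adjustment happens in the device code and may differ.
The function name and the backend numbers are made up for illustration.

```python
def effective_caps(requested, backend):
    """Clamp each requested dev-caps value to the backend's limit.

    Assumption: the final value is min(requested, backend limit).
    Parameters the backend does not report are left unchanged.
    """
    return {
        name: min(value, backend.get(name, value))
        for name, value in requested.items()
    }


# Values requested on the QEMU command line (illustrative numbers)
requested = {"dev-caps-max-qp": 1024, "dev-caps-max-cq": 2048}
# Limits reported by the backend device (hypothetical numbers)
backend = {"dev-caps-max-qp": 256, "dev-caps-max-cq": 4096}

print(effective_caps(requested, backend))
```

Here max-qp would be lowered to the backend's 256, while max-cq stays at the
requested 2048 because the backend allows more.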
diff --git a/docs/pvrdma.txt b/docs/pvrdma.txt
index 5599318159..f82b2a69d2 100644
--- a/docs/pvrdma.txt
+++ b/docs/pvrdma.txt
@@ -9,8 +9,9 @@ It works with its Linux Kernel driver AS IS, no need for any special guest
modifications.
While it complies with the VMware device, it can also communicate with bare
-metal RDMA-enabled machines and does not require an RDMA HCA in the host, it
-can work with Soft-RoCE (rxe).
+metal RDMA-enabled machines as peers.
+
+It does not require an RDMA HCA in the host; it can work with Soft-RoCE (rxe).
It does not require the whole guest RAM to be pinned allowing memory
over-commit and, even if not implemented yet, migration support will be
@@ -78,29 +79,93 @@ the required RDMA libraries.
3. Usage
========
+
+
+3.1 VM Memory settings
+======================
Currently the device is working only with memory backed RAM
and it must be mark as "shared":
-m 1G \
-object memory-backend-ram,id=mb1,size=1G,share \
-numa node,memdev=mb1 \
-The pvrdma device is composed of two functions:
- - Function 0 is a vmxnet Ethernet Device which is redundant in Guest
- but is required to pass the ibdevice GID using its MAC.
- Examples:
- For an rxe backend using eth0 interface it will use its mac:
- -device vmxnet3,addr=<slot>.0,multifunction=on,mac=<eth0 MAC>
- For an SRIOV VF, we take the Ethernet Interface exposed by it:
- -device vmxnet3,multifunction=on,mac=<RoCE eth MAC>
- - Function 1 is the actual device:
- -device pvrdma,addr=<slot>.1,backend-dev=<ibdevice>,backend-gid-idx=<gid>,backend-port=<port>
- where the ibdevice can be rxe or RDMA VF (e.g. mlx5_4)
- Note: Pay special attention that the GID at backend-gid-idx matches vmxnet's MAC.
- The rules of conversion are part of the RoCE spec, but since manual conversion
- is not required, spotting problems is not hard:
- Example: GID: fe80:0000:0000:0000:7efe:90ff:fecb:743a
- MAC: 7c:fe:90:cb:74:3a
- Note the difference between the first byte of the MAC and the GID.
+
+3.2 MAD Multiplexer
+===================
+MAD Multiplexer is a service that exposes a MAD-like interface for VMs in
+order to overcome the limitation where only a single entity can register with
+the MAD layer to send and receive RDMA-CM MAD packets.
+
+To build rdmacm-mux run
+# make rdmacm-mux
+
+The application accepts 3 command line arguments and exposes a UNIX socket
+through which control and data are passed to it.
+-s unix-socket-path   Path to unix socket to listen on
+                      (default /var/run/rdmacm-mux)
+-d rdma-device-name   Name of RDMA device to register with
+                      (default rxe0)
+-p rdma-device-port   Port number of RDMA device to register with
+                      (default 1)
+The final UNIX socket file name is a concatenation of the 3 arguments, so
+for example, for device mlx5_0 on port 2, the socket
+/var/run/rdmacm-mux-mlx5_0-2 will be created.
+
+Please refer to contrib/rdmacm-mux for more details.
+
+
+3.3 PCI devices settings
+========================
+A RoCE device exposes two functions: Ethernet and RDMA. To support this, the
+pvrdma device is composed of two PCI functions: an Ethernet device of type
+vmxnet3 on PCI function 0 and a PVRDMA device on PCI function 1. The Ethernet
+function can be used for other Ethernet purposes such as IP.
+
+
+3.4 Device parameters
+=====================
+- netdev: Specifies the Ethernet device on the host. For Soft-RoCE (rxe) this
+  would be the Ethernet device used to create it. For any other physical
+  RoCE device this would be the netdev name of the device.
+- ibdev: The IB device name on the host, for example rxe0, mlx5_0, etc.
+- mad-chardev: The name of the MAD multiplexer char device.
+- ibport: For a multi-port device (such as a Mellanox HCA), this
+  specifies the port to use. If not set, 1 is used.
+- dev-caps-max-mr-size: The maximum size of MR.
+- dev-caps-max-qp: Maximum number of QPs.
+- dev-caps-max-sge: Maximum number of SGE elements in WR.
+- dev-caps-max-cq: Maximum number of CQs.
+- dev-caps-max-mr: Maximum number of MRs.
+- dev-caps-max-pd: Maximum number of PDs.
+- dev-caps-max-ah: Maximum number of AHs.
+
+Notes:
+- The first 3 parameters are mandatory settings; the rest have
+  defaults.
+- The parameters prefixed by dev-caps define the upper limits, but the
+  final values are adjusted according to the backend device limitations.
+
+3.5 Example
+===========
+Define bridge device with vmxnet3 network backend:
+<interface type='bridge'>
+ <mac address='56:b4:44:e9:62:dc'/>
+ <source bridge='bridge1'/>
+ <model type='vmxnet3'/>
+  <address type='pci' domain='0x0000' bus='0x00' slot='0x10' function='0x0' multifunction='on'/>
+</interface>
+
+Define pvrdma device:
+<qemu:commandline>
+ <qemu:arg value='-object'/>
+ <qemu:arg value='memory-backend-ram,id=mb1,size=1G,share'/>
+ <qemu:arg value='-numa'/>
+ <qemu:arg value='node,memdev=mb1'/>
+ <qemu:arg value='-chardev'/>
+ <qemu:arg value='socket,path=/var/run/rdmacm-mux-rxe0-1,id=mads'/>
+ <qemu:arg value='-device'/>
+  <qemu:arg value='pvrdma,addr=10.1,ibdev=rxe0,netdev=bridge0,mad-chardev=mads'/>
+</qemu:commandline>
--
2.17.2