Re: [Qemu-devel] [PATCH 6/7] vhost-user: add multiple queue support
From: Yuanhan Liu
Subject: Re: [Qemu-devel] [PATCH 6/7] vhost-user: add multiple queue support
Date: Wed, 9 Sep 2015 21:19:06 +0800
User-agent: Mutt/1.5.21 (2010-09-15)
On Wed, Sep 09, 2015 at 03:18:53PM +0300, Michael S. Tsirkin wrote:
> On Tue, Sep 08, 2015 at 03:38:46PM +0800, Yuanhan Liu wrote:
> > From: Ouyang Changchun <address@hidden>
> >
> > This patch is initially based on a patch from Nikolay Nikolaev.
> >
> > Here is the latest version of vhost-user multiple queue support,
> > which creates an nc and vhost_net pair for each queue.
> >
> > What differs from the last version is that this patch addresses two
> > major concerns from Michael and fixes one hidden bug.
> >
> > - Concern #1: no feedback when the backend can't support the # of
> >   requested queues (given by the queues=# option).
> >
> > Here we address this issue by querying the VHOST_USER_PROTOCOL_F_MQ
> > protocol feature first: if it is not set, the backend doesn't support
> > the mq feature, and we let max_queues be 1. Otherwise, we send another
> > message, VHOST_USER_GET_QUEUE_NUM, to get the max_queues the backend
> > supports.
> >
> > At the vhost-user initialization stage (net_vhost_user_start), we then
> > initiate the first queue, which, in the meantime, also gets max_queues.
> > We then do a simple comparison: if requested_queues > max_queues, we
> > exit (I guess it's safe to exit here, as the VM is not running yet).
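
For illustration, that negotiation flow could look roughly like the sketch
below. This is not the patch's code; check_queues() and the two get_*()
helpers are hypothetical stand-ins for sending the corresponding vhost-user
messages and reading the reply.

    /* Illustrative sketch only -- not the actual patch code. */
    static void check_queues(CharDriverState *chr, int requested_queues)
    {
        uint64_t features = get_protocol_features(chr); /* hypothetical */
        int max_queues = 1;

        if (features & (1ULL << VHOST_USER_PROTOCOL_F_MQ)) {
            max_queues = get_queue_num(chr);            /* hypothetical */
        }
        if (requested_queues > max_queues) {
            error_report("requested %d queues, backend supports %d",
                         requested_queues, max_queues);
            exit(1); /* safe: the vm is not running yet */
        }
    }
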
> >
> > - Concern #2: some messages are sent more times than necessary.
> >
> > We came to an agreement with Michael that we could categorize vhost
> > user messages into 2 types: non-vring specific messages, which should
> > be sent only once, and vring specific messages, which should be sent
> > per queue.
> >
> > Here I introduced a helper function, vhost_user_one_time_request(),
> > which treats the following messages as non-vring specific messages:
> >
> >     VHOST_USER_GET_FEATURES
> >     VHOST_USER_SET_FEATURES
> >     VHOST_USER_GET_PROTOCOL_FEATURES
> >     VHOST_USER_SET_PROTOCOL_FEATURES
> >     VHOST_USER_SET_OWNER
> >     VHOST_USER_RESET_DEVICE
> >     VHOST_USER_SET_MEM_TABLE
> >     VHOST_USER_GET_QUEUE_NUM
> >
> > For the above messages, we simply ignore any attempt to send them after
> > the first time.
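
In effect (illustrative; dev_q0 and dev_qN stand for the vhost_dev of the
first and of any later queue pair):

    /* Only the first queue pair (vq_index == 0) emits one-time messages;
     * for later pairs vhost_user_call() returns 0 without sending anything. */
    vhost_user_call(dev_q0, VHOST_USER_SET_MEM_TABLE, mem); /* sent to backend */
    vhost_user_call(dev_qN, VHOST_USER_SET_MEM_TABLE, mem); /* silently skipped */
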
> >
> > I also observed a hidden bug in the last version. We register the char
> > dev event handler N times, which is unnecessary as well as buggy: a later
> > registration overwrites the former one, as qemu_chr_add_handlers() will
> > not chain those handlers, but instead overwrites the old one. So, in
> > theory, invoking qemu_chr_add_handlers() N times will not end up calling
> > the handler N times.
> >
> > However, the reason the handler is invoked N times is that we start the
> > backend (the connection server) first; hence, when net_vhost_user_init()
> > is executed, the connection is already established, and
> > qemu_chr_add_handlers() then invokes the handler immediately, which looks
> > just like invoking the handler (net_vhost_user_event) directly from
> > net_vhost_user_init().
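
A picture of the overwrite semantics (illustrative; handler_a, handler_b and
the opaque pointers are made-up names):

    /* qemu_chr_add_handlers() stores one handler set per chardev; it does
     * not chain.  After these two calls, only handler_b is registered: */
    qemu_chr_add_handlers(chr, NULL, NULL, handler_a, opaque_a);
    qemu_chr_add_handlers(chr, NULL, NULL, handler_b, opaque_b);
    /* And if the connection is already open, each call fires the handler it
     * just registered, which is why the handler still ran N times. */
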
> >
> > The solution I came up with is to make VhostUserState an upper-level
> > structure that includes N nc and vhost_net pairs:
> >
> >     struct VhostUserNetPeer {
> >         NetClientState *nc;
> >         VHostNetState *vhost_net;
> >     };
> >
> >     typedef struct VhostUserState {
> >         CharDriverState *chr;
> >         bool running;
> >         int queues;
> >         struct VhostUserNetPeer peers[];
> >     } VhostUserState;
> >
> > Signed-off-by: Nikolay Nikolaev <address@hidden>
> > Signed-off-by: Changchun Ouyang <address@hidden>
> > Signed-off-by: Yuanhan Liu <address@hidden>
> > ---
> >  docs/specs/vhost-user.txt |  13 +++++
> >  hw/virtio/vhost-user.c    |  31 ++++++++++-
> >  include/net/net.h         |   1 +
> >  net/vhost-user.c          | 136 ++++++++++++++++++++++++++++++++--------------
> >  qapi-schema.json          |   6 +-
> >  qemu-options.hx           |   5 +-
> >  6 files changed, 146 insertions(+), 46 deletions(-)
> >
> > diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
> > index 43db9b4..99d79be 100644
> > --- a/docs/specs/vhost-user.txt
> > +++ b/docs/specs/vhost-user.txt
> > @@ -135,6 +135,19 @@ As older slaves don't support negotiating protocol features,
> >  a feature bit was dedicated for this purpose:
> >  #define VHOST_USER_F_PROTOCOL_FEATURES 30
> >
> > +Multiple queue support
> > +----------------------
> > +Multiple queue support is treated as a protocol extension, hence the slave
> > +has to implement protocol features first. Multiple queues are supported only
> > +when the protocol feature VHOST_USER_PROTOCOL_F_MQ (bit 0) is set.
> > +
> > +The max # of queues the slave supports can be queried with the message
> > +VHOST_USER_GET_QUEUE_NUM. The master should stop when the # of requested
> > +queues is bigger than that.
> > +
> > +As all queues share one connection, the master uses a unique index for each
> > +queue in the sent message to identify a specific queue.
> > +
>
> Please also document that only 2 queues are enabled initially.
> More queues are enabled dynamically.
> To enable more queues, the new message needs to be sent
> (added by patch 7).
Hi Michael,
Will do that in the next version. Besides that, do you have more comments?
Say, does this patch address your two concerns?
--yliu
>
> >  Message types
> >  -------------
> >
> > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> > index 8046bc0..11e46b5 100644
> > --- a/hw/virtio/vhost-user.c
> > +++ b/hw/virtio/vhost-user.c
> > @@ -187,6 +187,23 @@ static int vhost_user_write(struct vhost_dev *dev, VhostUserMsg *msg,
> >          0 : -1;
> >  }
> >
> > +static bool vhost_user_one_time_request(VhostUserRequest request)
> > +{
> > +    switch (request) {
> > +    case VHOST_USER_GET_FEATURES:
> > +    case VHOST_USER_SET_FEATURES:
> > +    case VHOST_USER_GET_PROTOCOL_FEATURES:
> > +    case VHOST_USER_SET_PROTOCOL_FEATURES:
> > +    case VHOST_USER_SET_OWNER:
> > +    case VHOST_USER_RESET_DEVICE:
> > +    case VHOST_USER_SET_MEM_TABLE:
> > +    case VHOST_USER_GET_QUEUE_NUM:
> > +        return true;
> > +    default:
> > +        return false;
> > +    }
> > +}
> > +
> >  static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
> >                             void *arg)
> >  {
> > @@ -206,6 +223,14 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
> >      else
> >          msg_request = request;
> >
> > +    /*
> > +     * For non-vring specific requests, like VHOST_USER_GET_FEATURES,
> > +     * we just need to send it once, the first time. Later such
> > +     * requests are simply ignored.
> > +     */
> > +    if (vhost_user_one_time_request(msg_request) && dev->vq_index != 0)
> > +        return 0;
> > +
> >      msg.request = msg_request;
> >      msg.flags = VHOST_USER_VERSION;
> >      msg.size = 0;
> > @@ -268,17 +293,20 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
> >      case VHOST_USER_SET_VRING_NUM:
> >      case VHOST_USER_SET_VRING_BASE:
> >          memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
> > +        msg.addr.index += dev->vq_index;
> >          msg.size = sizeof(m.state);
> >          break;
> >
> >      case VHOST_USER_GET_VRING_BASE:
> >          memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
> > +        msg.addr.index += dev->vq_index;
> >          msg.size = sizeof(m.state);
> >          need_reply = 1;
> >          break;
> >
> >      case VHOST_USER_SET_VRING_ADDR:
> >          memcpy(&msg.addr, arg, sizeof(struct vhost_vring_addr));
> > +        msg.addr.index += dev->vq_index;
> >          msg.size = sizeof(m.addr);
> >          break;
> >
> > @@ -286,7 +314,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
> >      case VHOST_USER_SET_VRING_CALL:
> >      case VHOST_USER_SET_VRING_ERR:
> >          file = arg;
> > -        msg.u64 = file->index & VHOST_USER_VRING_IDX_MASK;
> > +        msg.u64 = (file->index + dev->vq_index) & VHOST_USER_VRING_IDX_MASK;
> >          msg.size = sizeof(m.u64);
> >          if (ioeventfd_enabled() && file->fd > 0) {
> >              fds[fd_num++] = file->fd;
> > @@ -330,6 +358,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
> > error_report("Received bad msg size.");
> > return -1;
> > }
> > + msg.state.index -= dev->vq_index;
> > memcpy(arg, &msg.state, sizeof(struct vhost_vring_state));
> > break;
> > default:
> > diff --git a/include/net/net.h b/include/net/net.h
> > index 6a6cbef..6f20656 100644
> > --- a/include/net/net.h
> > +++ b/include/net/net.h
> > @@ -92,6 +92,7 @@ struct NetClientState {
> >      NetClientDestructor *destructor;
> >      unsigned int queue_index;
> >      unsigned rxfilter_notify_enabled:1;
> > +    void *opaque;
> >  };
> >
> >  typedef struct NICState {
> > diff --git a/net/vhost-user.c b/net/vhost-user.c
> > index 2d6bbe5..7d4ac69 100644
> > --- a/net/vhost-user.c
> > +++ b/net/vhost-user.c
> > @@ -15,10 +15,16 @@
> > #include "qemu/config-file.h"
> > #include "qemu/error-report.h"
> >
> > +struct VhostUserNetPeer {
> > +    NetClientState *nc;
> > +    VHostNetState *vhost_net;
> > +};
> > +
> >  typedef struct VhostUserState {
> > -    NetClientState nc;
> >      CharDriverState *chr;
> > -    VHostNetState *vhost_net;
> > +    bool running;
> > +    int queues;
> > +    struct VhostUserNetPeer peers[];
> >  } VhostUserState;
> >
> >  typedef struct VhostUserChardevProps {
> > @@ -29,48 +35,75 @@ typedef struct VhostUserChardevProps {
> >
> >  VHostNetState *vhost_user_get_vhost_net(NetClientState *nc)
> >  {
> > -    VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc);
> > +    VhostUserState *s = nc->opaque;
> >      assert(nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER);
> > -    return s->vhost_net;
> > -}
> > -
> > -static int vhost_user_running(VhostUserState *s)
> > -{
> > -    return (s->vhost_net) ? 1 : 0;
> > +    return s->peers[nc->queue_index].vhost_net;
> >  }
> >
> >  static int vhost_user_start(VhostUserState *s)
> >  {
> >      VhostNetOptions options;
> > +    VHostNetState *vhost_net;
> > +    int max_queues;
> > +    int i = 0;
> >
> > -    if (vhost_user_running(s)) {
> > +    if (s->running)
> >          return 0;
> > -    }
> >
> >      options.backend_type = VHOST_BACKEND_TYPE_USER;
> > -    options.net_backend = &s->nc;
> >      options.opaque = s->chr;
> >
> > -    s->vhost_net = vhost_net_init(&options);
> > +    options.net_backend = s->peers[i].nc;
> > +    vhost_net = s->peers[i++].vhost_net = vhost_net_init(&options);
> > +
> > +    max_queues = vhost_net_get_max_queues(vhost_net);
> > +    if (s->queues >= max_queues) {
> > +        error_report("you are asking more queues than supported: %d\n",
> > +                     max_queues);
> > +        return -1;
> > +    }
> > +
> > +    for (; i < s->queues; i++) {
> > +        options.net_backend = s->peers[i].nc;
> > +
> > +        s->peers[i].vhost_net = vhost_net_init(&options);
> > +        if (!s->peers[i].vhost_net)
> > +            return -1;
> > +    }
> > +    s->running = true;
> >
> > -    return vhost_user_running(s) ? 0 : -1;
> > +    return 0;
> >  }
> >
> >  static void vhost_user_stop(VhostUserState *s)
> >  {
> > -    if (vhost_user_running(s)) {
> > -        vhost_net_cleanup(s->vhost_net);
> > +    int i;
> > +    VHostNetState *vhost_net;
> > +
> > +    if (!s->running)
> > +        return;
> > +
> > +    for (i = 0; i < s->queues; i++) {
> > +        vhost_net = s->peers[i].vhost_net;
> > +        if (vhost_net)
> > +            vhost_net_cleanup(vhost_net);
> >      }
> >
> > -    s->vhost_net = 0;
> > +    s->running = false;
> >  }
> >
> >  static void vhost_user_cleanup(NetClientState *nc)
> >  {
> > -    VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc);
> > +    VhostUserState *s = nc->opaque;
> > +    VHostNetState *vhost_net = s->peers[nc->queue_index].vhost_net;
> > +
> > +    if (vhost_net)
> > +        vhost_net_cleanup(vhost_net);
> >
> > -    vhost_user_stop(s);
> >      qemu_purge_queued_packets(nc);
> > +
> > +    if (nc->queue_index == s->queues - 1)
> > +        free(s);
> >  }
> >
> >  static bool vhost_user_has_vnet_hdr(NetClientState *nc)
> > @@ -89,7 +122,7 @@ static bool vhost_user_has_ufo(NetClientState *nc)
> >
> >  static NetClientInfo net_vhost_user_info = {
> >          .type = NET_CLIENT_OPTIONS_KIND_VHOST_USER,
> > -        .size = sizeof(VhostUserState),
> > +        .size = sizeof(NetClientState),
> >          .cleanup = vhost_user_cleanup,
> >          .has_vnet_hdr = vhost_user_has_vnet_hdr,
> >          .has_ufo = vhost_user_has_ufo,
> > @@ -97,18 +130,25 @@ static NetClientInfo net_vhost_user_info = {
> >
> >  static void net_vhost_link_down(VhostUserState *s, bool link_down)
> >  {
> > -    s->nc.link_down = link_down;
> > +    NetClientState *nc;
> > +    int i;
> >
> > -    if (s->nc.peer) {
> > -        s->nc.peer->link_down = link_down;
> > -    }
> > +    for (i = 0; i < s->queues; i++) {
> > +        nc = s->peers[i].nc;
> >
> > -    if (s->nc.info->link_status_changed) {
> > -        s->nc.info->link_status_changed(&s->nc);
> > -    }
> > +        nc->link_down = link_down;
> > +
> > +        if (nc->peer) {
> > +            nc->peer->link_down = link_down;
> > +        }
> >
> > -    if (s->nc.peer && s->nc.peer->info->link_status_changed) {
> > -        s->nc.peer->info->link_status_changed(s->nc.peer);
> > +        if (nc->info->link_status_changed) {
> > +            nc->info->link_status_changed(nc);
> > +        }
> > +
> > +        if (nc->peer && nc->peer->info->link_status_changed) {
> > +            nc->peer->info->link_status_changed(nc->peer);
> > +        }
> >      }
> >  }
> >
> > @@ -118,7 +158,8 @@ static void net_vhost_user_event(void *opaque, int event)
> >
> >      switch (event) {
> >      case CHR_EVENT_OPENED:
> > -        vhost_user_start(s);
> > +        if (vhost_user_start(s) < 0)
> > +            exit(1);
> >          net_vhost_link_down(s, false);
> >          error_report("chardev \"%s\" went up", s->chr->label);
> >          break;
> > @@ -131,24 +172,28 @@ static void net_vhost_user_event(void *opaque, int event)
> >  }
> >
> >  static int net_vhost_user_init(NetClientState *peer, const char *device,
> > -                               const char *name, CharDriverState *chr)
> > +                               const char *name, VhostUserState *s)
> >  {
> >      NetClientState *nc;
> > -    VhostUserState *s;
> > +    CharDriverState *chr = s->chr;
> > +    int i;
> >
> > -    nc = qemu_new_net_client(&net_vhost_user_info, peer, device, name);
> > +    for (i = 0; i < s->queues; i++) {
> > +        nc = qemu_new_net_client(&net_vhost_user_info, peer, device, name);
> >
> > -    snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user to %s",
> > -             chr->label);
> > +        snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user%d to %s",
> > +                 i, chr->label);
> >
> > -    s = DO_UPCAST(VhostUserState, nc, nc);
> > +        /* We don't provide a receive callback */
> > +        nc->receive_disabled = 1;
> >
> > -    /* We don't provide a receive callback */
> > -    s->nc.receive_disabled = 1;
> > -    s->chr = chr;
> > -    nc->queue_index = 0;
> > +        nc->queue_index = i;
> > +        nc->opaque = s;
> >
> > -    qemu_chr_add_handlers(s->chr, NULL, NULL, net_vhost_user_event, s);
> > +        s->peers[i].nc = nc;
> > +    }
> > +
> > +    qemu_chr_add_handlers(chr, NULL, NULL, net_vhost_user_event, s);
> >
> >      return 0;
> >  }
> > @@ -227,8 +272,10 @@ static int net_vhost_check_net(void *opaque, QemuOpts *opts, Error **errp)
> >  int net_init_vhost_user(const NetClientOptions *opts, const char *name,
> >                          NetClientState *peer, Error **errp)
> >  {
> > +    int queues;
> >      const NetdevVhostUserOptions *vhost_user_opts;
> >      CharDriverState *chr;
> > +    VhostUserState *s;
> >
> >      assert(opts->kind == NET_CLIENT_OPTIONS_KIND_VHOST_USER);
> >      vhost_user_opts = opts->vhost_user;
> > @@ -244,6 +291,11 @@ int net_init_vhost_user(const NetClientOptions *opts, const char *name,
> >          return -1;
> >      }
> >
> > +    queues = vhost_user_opts->has_queues ? vhost_user_opts->queues : 1;
> > +    s = g_malloc0(sizeof(VhostUserState) +
> > +                  queues * sizeof(struct VhostUserNetPeer));
> > +    s->queues = queues;
> > +    s->chr = chr;
> >
> > -    return net_vhost_user_init(peer, "vhost_user", name, chr);
> > +    return net_vhost_user_init(peer, "vhost_user", name, s);
> >  }
> > diff --git a/qapi-schema.json b/qapi-schema.json
> > index 67fef37..55c33db 100644
> > --- a/qapi-schema.json
> > +++ b/qapi-schema.json
> > @@ -2480,12 +2480,16 @@
> >  #
> >  # @vhostforce: #optional vhost on for non-MSIX virtio guests (default: false).
> >  #
> > +# @queues: #optional number of queues to be created for multiqueue vhost-user
> > +#          (default: 1) (Since 2.5)
> > +#
> >  # Since 2.1
> >  ##
> >  { 'struct': 'NetdevVhostUserOptions',
> >    'data': {
> >      'chardev': 'str',
> > -    '*vhostforce': 'bool' } }
> > +    '*vhostforce': 'bool',
> > +    '*queues': 'int' } }
> >
> >  ##
> >  # @NetClientOptions
> > diff --git a/qemu-options.hx b/qemu-options.hx
> > index 77f5853..5bfa7a3 100644
> > --- a/qemu-options.hx
> > +++ b/qemu-options.hx
> > @@ -1963,13 +1963,14 @@ The hubport netdev lets you connect a NIC to a QEMU "vlan" instead of a single
> >  netdev. @code{-net} and @code{-device} with parameter @option{vlan} create the
> >  required hub automatically.
> >
> > -@item -netdev vhost-user,chardev=@var{id}[,vhostforce=on|off]
> > +@item -netdev vhost-user,chardev=@var{id}[,vhostforce=on|off][,queues=n]
> >
> >  Establish a vhost-user netdev, backed by a chardev @var{id}. The chardev should
> >  be a unix domain socket backed one. The vhost-user uses a specifically defined
> >  protocol to pass vhost ioctl replacement messages to an application on the other
> >  end of the socket. On non-MSIX guests, the feature can be forced with
> > -@var{vhostforce}.
> > +@var{vhostforce}. Use 'queues=@var{n}' to specify the number of queues to
> > +be created for multiqueue vhost-user.
> >
> >  Example:
> >  @example
> > --
> > 1.9.0
> >