[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [snabb-devel] Re: [PATCH v6 1/2] vhost-user: add multi
From: |
Michael S. Tsirkin |
Subject: |
Re: [Qemu-devel] [snabb-devel] Re: [PATCH v6 1/2] vhost-user: add multi queue support |
Date: |
Thu, 27 Aug 2015 16:05:06 +0300 |
On Tue, Aug 25, 2015 at 03:25:54AM +0000, Ouyang, Changchun wrote:
> Hi Michael,
>
> > -----Original Message-----
> > From: address@hidden [mailto:snabb-
> > address@hidden On Behalf Of Michael S. Tsirkin
> > Sent: Thursday, August 13, 2015 5:19 PM
> > To: Ouyang, Changchun
> > Cc: address@hidden; address@hidden;
> > address@hidden; address@hidden;
> > address@hidden; Long, Thomas
> > Subject: [snabb-devel] Re: [PATCH v6 1/2] vhost-user: add multi queue
> > support
> >
> > On Wed, Aug 12, 2015 at 02:25:41PM +0800, Ouyang Changchun wrote:
> > > Based on patch by Nikolay Nikolaev:
> > > Vhost-user will implement the multi queue support in a similar way to
> > > what vhost already has - a separate thread for each queue.
> > > To enable the multi queue functionality - a new command line parameter
> > > "queues" is introduced for the vhost-user netdev.
> > >
> > > The RESET_OWNER change is based on commit:
> > > 294ce717e0f212ed0763307f3eab72b4a1bdf4d0
> > > If it is reverted, the patch need update for it accordingly.
> > >
> > > Signed-off-by: Nikolay Nikolaev <address@hidden>
> > > Signed-off-by: Changchun Ouyang <address@hidden>
> > > ---
> > > Changes since v5:
> > > - fix the message descption for VHOST_RESET_OWNER in vhost-user txt
> > >
> > > Changes since v4:
> > > - remove the unnecessary trailing '\n'
> > >
> > > Changes since v3:
> > > - fix one typo and wrap one long line
> > >
> > > Changes since v2:
> > > - fix vq index issue for set_vring_call
> > > When it is the case of VHOST_SET_VRING_CALL, The vq_index is not
> > initialized before it is used,
> > > thus it could be a random value. The random value leads to crash in
> > > vhost
> > after passing down
> > > to vhost, as vhost use this random value to index an array index.
> > > - fix the typo in the doc and description
> > > - address vq index for reset_owner
> > >
> > > Changes since v1:
> > > - use s->nc.info_str when bringing up/down the backend
> > >
> > > docs/specs/vhost-user.txt | 7 ++++++-
> > > hw/net/vhost_net.c | 3 ++-
> > > hw/virtio/vhost-user.c | 11 ++++++++++-
> > > net/vhost-user.c | 37 ++++++++++++++++++++++++-------------
> > > qapi-schema.json | 6 +++++-
> > > qemu-options.hx | 5 +++--
> > > 6 files changed, 50 insertions(+), 19 deletions(-)
> > >
> > > diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
> > > index 70da3b1..9390f89 100644
> > > --- a/docs/specs/vhost-user.txt
> > > +++ b/docs/specs/vhost-user.txt
> > > @@ -135,6 +135,11 @@ As older slaves don't support negotiating
> > > protocol features, a feature bit was dedicated for this purpose:
> > > #define VHOST_USER_F_PROTOCOL_FEATURES 30
> > >
> > > +Multi queue support
> > > +-------------------
> > > +The protocol supports multiple queues by setting all index fields in
> > > +the sent messages to a properly calculated value.
> > > +
> > > Message types
> > > -------------
> > >
> > > @@ -198,7 +203,7 @@ Message types
> > >
> > > Id: 4
> > > Equivalent ioctl: VHOST_RESET_OWNER
> > > - Master payload: N/A
> > > + Master payload: vring state description
> > >
> > > Issued when a new connection is about to be closed. The Master
> > > will no
> > > longer own this connection (and will usually close it).
> >
> > This is an interface change, isn't it?
> > We can't make it unconditionally, need to make it dependent on a protocol
> > flag.
Pls remember to fix this one.
> >
> > > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index
> > > 1f25cb3..9cd6c05 100644
> > > --- a/hw/net/vhost_net.c
> > > +++ b/hw/net/vhost_net.c
> > > @@ -159,6 +159,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions
> > > *options)
> > >
> > > net->dev.nvqs = 2;
> > > net->dev.vqs = net->vqs;
> > > + net->dev.vq_index = net->nc->queue_index;
> > >
> > > r = vhost_dev_init(&net->dev, options->opaque,
> > > options->backend_type, options->force); @@
> > > -269,7 +270,7 @@ static void vhost_net_stop_one(struct vhost_net *net,
> > > for (file.index = 0; file.index < net->dev.nvqs; ++file.index) {
> > > const VhostOps *vhost_ops = net->dev.vhost_ops;
> > > int r = vhost_ops->vhost_call(&net->dev, VHOST_RESET_OWNER,
> > > - NULL);
> > > + &file);
> > > assert(r >= 0);
> > > }
> > > }
> > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index
> > > 27ba035..fb11d4c 100644
> > > --- a/hw/virtio/vhost-user.c
> > > +++ b/hw/virtio/vhost-user.c
> > > @@ -219,7 +219,12 @@ static int vhost_user_call(struct vhost_dev *dev,
> > unsigned long int request,
> > > break;
> > >
> > > case VHOST_USER_SET_OWNER:
> > > + break;
> > > +
> > > case VHOST_USER_RESET_OWNER:
> > > + memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
> > > + msg.state.index += dev->vq_index;
> > > + msg.size = sizeof(m.state);
> > > break;
> > >
> > > case VHOST_USER_SET_MEM_TABLE:
> > > @@ -262,17 +267,20 @@ static int vhost_user_call(struct vhost_dev *dev,
> > unsigned long int request,
> > > case VHOST_USER_SET_VRING_NUM:
> > > case VHOST_USER_SET_VRING_BASE:
> > > memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
> > > + msg.state.index += dev->vq_index;
> > > msg.size = sizeof(m.state);
> > > break;
> > >
> > > case VHOST_USER_GET_VRING_BASE:
> > > memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
> > > + msg.state.index += dev->vq_index;
> > > msg.size = sizeof(m.state);
> > > need_reply = 1;
> > > break;
> > >
> > > case VHOST_USER_SET_VRING_ADDR:
> > > memcpy(&msg.addr, arg, sizeof(struct vhost_vring_addr));
> > > + msg.addr.index += dev->vq_index;
> > > msg.size = sizeof(m.addr);
> > > break;
> > >
> > > @@ -280,7 +288,7 @@ static int vhost_user_call(struct vhost_dev *dev,
> > unsigned long int request,
> > > case VHOST_USER_SET_VRING_CALL:
> > > case VHOST_USER_SET_VRING_ERR:
> > > file = arg;
> > > - msg.u64 = file->index & VHOST_USER_VRING_IDX_MASK;
> > > + msg.u64 = (file->index + dev->vq_index) &
> > > + VHOST_USER_VRING_IDX_MASK;
> > > msg.size = sizeof(m.u64);
> > > if (ioeventfd_enabled() && file->fd > 0) {
> > > fds[fd_num++] = file->fd; @@ -322,6 +330,7 @@ static int
> > > vhost_user_call(struct vhost_dev *dev, unsigned long int request,
> > > error_report("Received bad msg size.");
> > > return -1;
> > > }
> > > + msg.state.index -= dev->vq_index;
> > > memcpy(arg, &msg.state, sizeof(struct vhost_vring_state));
> > > break;
> > > default:
> > > diff --git a/net/vhost-user.c b/net/vhost-user.c index
> > > 1d86a2b..904d8af 100644
> > > --- a/net/vhost-user.c
> > > +++ b/net/vhost-user.c
> > > @@ -121,35 +121,39 @@ static void net_vhost_user_event(void *opaque,
> > int event)
> > > case CHR_EVENT_OPENED:
> > > vhost_user_start(s);
> > > net_vhost_link_down(s, false);
> > > - error_report("chardev \"%s\" went up", s->chr->label);
> > > + error_report("chardev \"%s\" went up", s->nc.info_str);
> > > break;
> > > case CHR_EVENT_CLOSED:
> > > net_vhost_link_down(s, true);
> > > vhost_user_stop(s);
> > > - error_report("chardev \"%s\" went down", s->chr->label);
> > > + error_report("chardev \"%s\" went down", s->nc.info_str);
> > > break;
> > > }
> > > }
BTW this seems pretty hacky: you get multiple messages when one
client connects. Why add multiple event listeners to the same
chat device?
> > >
> > > static int net_vhost_user_init(NetClientState *peer, const char *device,
> > > - const char *name, CharDriverState *chr)
> > > + const char *name, CharDriverState *chr,
> > > + uint32_t queues)
> > > {
> > > NetClientState *nc;
> > > VhostUserState *s;
> > > + int i;
> > >
> > > - nc = qemu_new_net_client(&net_vhost_user_info, peer, device,
> > name);
> > > + for (i = 0; i < queues; i++) {
> > > + nc = qemu_new_net_client(&net_vhost_user_info, peer, device,
> > > + name);
> > >
> > > - snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user to %s",
> > > - chr->label);
> > > + snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user%d to
> > > %s",
> > > + i, chr->label);
> > >
> > > - s = DO_UPCAST(VhostUserState, nc, nc);
> > > + s = DO_UPCAST(VhostUserState, nc, nc);
> > >
> > > - /* We don't provide a receive callback */
> > > - s->nc.receive_disabled = 1;
> > > - s->chr = chr;
> > > -
> > > - qemu_chr_add_handlers(s->chr, NULL, NULL, net_vhost_user_event,
> > s);
> > > + /* We don't provide a receive callback */
> > > + s->nc.receive_disabled = 1;
> > > + s->chr = chr;
> > > + s->nc.queue_index = i;
> > >
> > > + qemu_chr_add_handlers(s->chr, NULL, NULL, net_vhost_user_event,
> > s);
> > > + }
> > > return 0;
> > > }
> > >
> > > @@ -225,6 +229,7 @@ static int net_vhost_check_net(QemuOpts *opts,
> > > void *opaque)
> >
> >
> > There are two problems here:
> >
> > 1. we don't really know that the backend
> > is able to support the requested number of queues.
> > If not, everything will fail, silently.
> > A new message to query the # of queues could help, though
> > I'm not sure what can be done on failure. Fail connection?
> >
> > 2. each message (e.g. set memory table) is sent multiple times,
> > on the same socket.
> >
> I think it is tough to resolve these 2 comments, as the current message is
> either vhost-dev based or virt-queue based,
> The multiple queues(pair) feature use multiple vhost-devs to implement itself.
> For #1
> So the queue number is something should be seen in the upper level of
> vhost-dev rather than inside the vhost-dev.
> For each vhost-net, there are 2 virt-queues, one is for Rx the other is for
> Tx.
> introduce the virt-queue pair number into the vhost-dev? But I don't think it
> is good, as for each vhost-dev, there is only one
> virt-queue pair.
>
> Where should I put the virt-queue pair number to? I don't get the perfect
> answer till now. Any suggestion is welcome.
>
> Could we assume the vhost backend has the ability to create enough virt-queue
> pair(e.g. 0x8000 is the max) if qemu require
> Vhost backend to do it. If it is correct, we don't need get virt-queue pair
> number from vhost backend, as vhost backend can
> Create all virt-queue pair required by qemu.
> The virtio frontend(on guest) has the flexibility to enable which virt-queue
> according to its own capability, qemu can do it by using
> Set_vring_flag message to notify vhost backend.
I'm reluctant to agree to this. Implementations tend to get this wrong,
e.g. they would only test with 2 queues and assume everything is OK.
With an explicit message, this seems more robust.
Why is it so hard to implement? User specifies queues=X.
Can't we simply validate that backend supports this # of queues?
> For #2
> The memory table message is also vhost-dev based, it wouldn't hurt we send it
> a few times, vhost backend could
> Keep it vhost-dev based too, or keep it once(keep it when first time and
> ignore in rest messages from the same connected-fd)
> Any other good suggestion is welcome too :-)
Add code in vhost-user to skip sending the useless messages? Yes they
seem harmless but implementations tend to develop dependencies on such
bugs, then we get to maintain them forever.
> >
> >
> > > int net_init_vhost_user(const NetClientOptions *opts, const char *name,
> > > NetClientState *peer) {
> > > + uint32_t queues;
> > > const NetdevVhostUserOptions *vhost_user_opts;
> > > CharDriverState *chr;
> > >
> > > @@ -243,6 +248,12 @@ int net_init_vhost_user(const NetClientOptions
> > *opts, const char *name,
> > > return -1;
> > > }
> > >
> > > + /* number of queues for multiqueue */
> > > + if (vhost_user_opts->has_queues) {
> > > + queues = vhost_user_opts->queues;
> > > + } else {
> > > + queues = 1;
> > > + }
> > >
> > > - return net_vhost_user_init(peer, "vhost_user", name, chr);
> > > + return net_vhost_user_init(peer, "vhost_user", name, chr,
> > > + queues);
> > > }
> > > diff --git a/qapi-schema.json b/qapi-schema.json index
> > > f97ffa1..51e40ce 100644
> > > --- a/qapi-schema.json
> > > +++ b/qapi-schema.json
> > > @@ -2444,12 +2444,16 @@
> > > #
> > > # @vhostforce: #optional vhost on for non-MSIX virtio guests (default:
> > false).
> > > #
> > > +# @queues: #optional number of queues to be created for multiqueue
> > vhost-user
> > > +# (default: 1) (Since 2.5)
> > > +#
> > > # Since 2.1
> > > ##
> > > { 'struct': 'NetdevVhostUserOptions',
> > > 'data': {
> > > 'chardev': 'str',
> > > - '*vhostforce': 'bool' } }
> > > + '*vhostforce': 'bool',
> > > + '*queues': 'uint32' } }
> > >
> > > ##
> > > # @NetClientOptions
> > > diff --git a/qemu-options.hx b/qemu-options.hx index ec356f6..dad035e
> > > 100644
> > > --- a/qemu-options.hx
> > > +++ b/qemu-options.hx
> > > @@ -1942,13 +1942,14 @@ The hubport netdev lets you connect a NIC to a
> > > QEMU "vlan" instead of a single netdev. @code{-net} and
> > > @code{-device} with parameter @option{vlan} create the required hub
> > automatically.
> > >
> > > address@hidden -netdev vhost-user,address@hidden,vhostforce=on|off]
> > > address@hidden -netdev
> > > +vhost-user,address@hidden,vhostforce=on|off][,queues=n]
> > >
> > > Establish a vhost-user netdev, backed by a chardev @var{id}. The
> > > chardev should be a unix domain socket backed one. The vhost-user
> > > uses a specifically defined protocol to pass vhost ioctl replacement
> > > messages to an application on the other end of the socket. On
> > > non-MSIX guests, the feature can be forced with address@hidden
> > > address@hidden Use 'address@hidden' to specify the number of
> > > +queues to be created for multiqueue vhost-user.
> > >
> > > Example:
> > > @example
> > > --
> > > 1.8.4.2
> > >
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "Snabb Switch development" group.
> > To unsubscribe from this group and stop receiving emails from it, send an
> > email to address@hidden
> > To post to this group, send an email to address@hidden
> > Visit this group at http://groups.google.com/group/snabb-devel.
- [Qemu-devel] [PATCH v6 0/2] vhost-user multi queue support, Ouyang Changchun, 2015/08/12
- [Qemu-devel] [PATCH v6 1/2] vhost-user: add multi queue support, Ouyang Changchun, 2015/08/12
- Re: [Qemu-devel] [PATCH v6 1/2] vhost-user: add multi queue support, Michael S. Tsirkin, 2015/08/13
- Re: [Qemu-devel] [snabb-devel] Re: [PATCH v6 1/2] vhost-user: add multi queue support, Ouyang, Changchun, 2015/08/24
- Re: [Qemu-devel] [snabb-devel] Re: [PATCH v6 1/2] vhost-user: add multi queue support,
Michael S. Tsirkin <=
- Re: [Qemu-devel] [snabb-devel] Re: [PATCH v6 1/2] vhost-user: add multi queue support, Ouyang, Changchun, 2015/08/27
- Re: [Qemu-devel] [snabb-devel] Re: [PATCH v6 1/2] vhost-user: add multi queue support, Michael S. Tsirkin, 2015/08/30
- Re: [Qemu-devel] [snabb-devel] Re: [PATCH v6 1/2] vhost-user: add multi queue support, Ouyang, Changchun, 2015/08/31
- Re: [Qemu-devel] [snabb-devel] Re: [PATCH v6 1/2] vhost-user: add multi queue support, Michael S. Tsirkin, 2015/08/31
- Re: [Qemu-devel] [snabb-devel] Re: [PATCH v6 1/2] vhost-user: add multi queue support, Eric Blake, 2015/08/31
[Qemu-devel] [PATCH v6 2/2] vhost-user: new protocol feature for multi queue, Ouyang Changchun, 2015/08/12