Re: [Qemu-devel] [PATCH 6/7] vhost-user: add multiple queue support


From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [PATCH 6/7] vhost-user: add multiple queue support
Date: Wed, 9 Sep 2015 15:18:53 +0300

On Tue, Sep 08, 2015 at 03:38:46PM +0800, Yuanhan Liu wrote:
> From: Ouyang Changchun <address@hidden>
> 
> This patch is initially based on a patch from Nikolay Nikolaev.
> 
> Here is the latest version for adding vhost-user multiple queue support,
> by creating a nc and vhost_net pair for each queue.
> 
> What differs from the last version is that this patch addresses two major
> concerns from Michael, and fixes one hidden bug.
> 
> - Concern #1: no feedback when the backend can't support the # of
>   requested queues (given by the queues=# option).
> 
>   Here we address this issue by querying the VHOST_USER_PROTOCOL_F_MQ
>   protocol feature first; if it is not set, the backend doesn't support
>   the mq feature, and max_queues is left at 1. Otherwise, we send another
>   message, VHOST_USER_GET_QUEUE_NUM, to get the max_queues the backend
>   supports.
> 
>   At the vhost-user initiation stage (net_vhost_user_start), we then
>   initiate one queue first, which, in the meantime, also retrieves
>   max_queues. We then do a simple comparison: if requested_queues >
>   max_queues, we exit (I guess it's safe to exit here, as the VM is not
>   running yet). A rough sketch of this flow is shown below.
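> 
>   The following is only an illustrative sketch of that negotiation; the
>   query_protocol_features()/query_queue_num() helpers are made up for
>   illustration, and the real logic lives in vhost_user_start() below:
> 
>       uint64_t proto_features = 0;
>       int max_queues = 1;
> 
>       if (query_protocol_features(&proto_features) == 0 &&
>           (proto_features & (1ULL << VHOST_USER_PROTOCOL_F_MQ))) {
>           /* backend advertises MQ: ask how many queues it supports */
>           max_queues = query_queue_num();  /* VHOST_USER_GET_QUEUE_NUM */
>       }
> 
>       if (requested_queues > max_queues) {
>           error_report("you are asking more queues than supported: %d",
>                        max_queues);
>           exit(1);  /* safe: the VM is not running yet */
>       }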
> 
> - Concern #2: some messages are sent more times than necessary.
> 
>   We came to an agreement with Michael that we could categorize vhost-user
>   messages into 2 types: non-vring-specific messages, which should be sent
>   only once, and vring-specific messages, which should be sent per queue.
> 
>   Here I introduced a helper function, vhost_user_one_time_request(),
>   which lists the following messages as non-vring-specific messages:
> 
>         VHOST_USER_GET_FEATURES
>         VHOST_USER_SET_FEATURES
>         VHOST_USER_GET_PROTOCOL_FEATURES
>         VHOST_USER_SET_PROTOCOL_FEATURES
>         VHOST_USER_SET_OWNER
>         VHOST_USER_RESET_DEVICE
>         VHOST_USER_SET_MEM_TABLE
>         VHOST_USER_GET_QUEUE_NUM
> 
>   For the above messages, we simply ignore them when they are sent again
>   after the first time.
> 
> I also observed a hidden bug in the last version. We register the char dev
> event handler N times, which is unnecessary as well as buggy: a later
> registration overwrites the former one, as qemu_chr_add_handlers() will not
> chain those handlers, but instead overwrites the old one. So, in theory,
> invoking qemu_chr_add_handlers() N times will not end up with the handler
> being called N times; see the sketch below.
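> 
>   Below is only an illustrative sketch of that overwrite behaviour (not
>   code from this patch); s0/s1 are hypothetical opaque pointers:
> 
>       /* The second call replaces the handlers installed by the first,
>        * so only the registration carrying s1 remains active. */
>       qemu_chr_add_handlers(chr, NULL, NULL, net_vhost_user_event, s0);
>       qemu_chr_add_handlers(chr, NULL, NULL, net_vhost_user_event, s1);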
> 
> However, the reason the handler is invoked N times is that we start the
> backend (the connection server) first, and hence when net_vhost_user_init()
> is executed, the connection is already established, and
> qemu_chr_add_handlers() then invokes the handler immediately, which just
> looks like we invoke the handler (net_vhost_user_event) directly from
> net_vhost_user_init().
> 
> The solution I came up with is to make VhostUserState an upper-level
> structure, so that it includes N nc and vhost_net pairs:
> 
>    struct VhostUserNetPeer {
>        NetClientState *nc;
>        VHostNetState  *vhost_net;
>    };
> 
>    typedef struct VhostUserState {
>        CharDriverState *chr;
>        bool running;
>        int queues;
>        struct VhostUserNetPeer peers[];
>    } VhostUserState;
> 
> Signed-off-by: Nikolay Nikolaev <address@hidden>
> Signed-off-by: Changchun Ouyang <address@hidden>
> Signed-off-by: Yuanhan Liu <address@hidden>
> ---
>  docs/specs/vhost-user.txt |  13 +++++
>  hw/virtio/vhost-user.c    |  31 ++++++++++-
>  include/net/net.h         |   1 +
>  net/vhost-user.c          | 136 ++++++++++++++++++++++++++++++++--------------
>  qapi-schema.json          |   6 +-
>  qemu-options.hx           |   5 +-
>  6 files changed, 146 insertions(+), 46 deletions(-)
> 
> diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
> index 43db9b4..99d79be 100644
> --- a/docs/specs/vhost-user.txt
> +++ b/docs/specs/vhost-user.txt
> @@ -135,6 +135,19 @@ As older slaves don't support negotiating protocol features,
>  a feature bit was dedicated for this purpose:
>  #define VHOST_USER_F_PROTOCOL_FEATURES 30
>  
> +Multiple queue support
> +-------------------
> +Multiple queue is treated as a protocal extension, hence the slave has to
> +implement protocol features first. Multiple queues is supported only when
> +the protocol feature VHOST_USER_PROTOCOL_F_MQ(bit 0) is set.
> +
> +The max # of queues the slave support can be queried with message
> +VHOST_USER_GET_PROTOCOL_FEATURES. Master should stop when the # of requested
> +queues is bigger than that.
> +
> +As all queues share one connection, the master use a unique index for each
> +queue in the sent message to identify one specified queue.
> +

Please also document that only 2 queues are enabled initially.
More queues are enabled dynamically.
To enable more queues, the new message (added by patch 7) needs to be sent.

>  Message types
>  -------------
>  
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 8046bc0..11e46b5 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -187,6 +187,23 @@ static int vhost_user_write(struct vhost_dev *dev, VhostUserMsg *msg,
>              0 : -1;
>  }
>  
> +static bool vhost_user_one_time_request(VhostUserRequest request)
> +{
> +    switch (request) {
> +    case VHOST_USER_GET_FEATURES:
> +    case VHOST_USER_SET_FEATURES:
> +    case VHOST_USER_GET_PROTOCOL_FEATURES:
> +    case VHOST_USER_SET_PROTOCOL_FEATURES:
> +    case VHOST_USER_SET_OWNER:
> +    case VHOST_USER_RESET_DEVICE:
> +    case VHOST_USER_SET_MEM_TABLE:
> +    case VHOST_USER_GET_QUEUE_NUM:
> +        return true;
> +    default:
> +        return false;
> +    }
> +}
> +
>  static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
>          void *arg)
>  {
> @@ -206,6 +223,14 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
>      else
>          msg_request = request;
>  
> +    /*
> +     * For none-vring specific requests, like VHOST_USER_GET_FEATURES,
> +     * we just need send it once in the first time. For later such
> +     * request, we just ignore it.
> +     */
> +    if (vhost_user_one_time_request(msg_request) && dev->vq_index != 0)
> +        return 0;
> +
>      msg.request = msg_request;
>      msg.flags = VHOST_USER_VERSION;
>      msg.size = 0;
> @@ -268,17 +293,20 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
>      case VHOST_USER_SET_VRING_NUM:
>      case VHOST_USER_SET_VRING_BASE:
>          memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
> +        msg.addr.index += dev->vq_index;
>          msg.size = sizeof(m.state);
>          break;
>  
>      case VHOST_USER_GET_VRING_BASE:
>          memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
> +        msg.addr.index += dev->vq_index;
>          msg.size = sizeof(m.state);
>          need_reply = 1;
>          break;
>  
>      case VHOST_USER_SET_VRING_ADDR:
>          memcpy(&msg.addr, arg, sizeof(struct vhost_vring_addr));
> +        msg.addr.index += dev->vq_index;
>          msg.size = sizeof(m.addr);
>          break;
>  
> @@ -286,7 +314,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
>      case VHOST_USER_SET_VRING_CALL:
>      case VHOST_USER_SET_VRING_ERR:
>          file = arg;
> -        msg.u64 = file->index & VHOST_USER_VRING_IDX_MASK;
> +        msg.u64 = (file->index + dev->vq_index) & VHOST_USER_VRING_IDX_MASK;
>          msg.size = sizeof(m.u64);
>          if (ioeventfd_enabled() && file->fd > 0) {
>              fds[fd_num++] = file->fd;
> @@ -330,6 +358,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
>                  error_report("Received bad msg size.");
>                  return -1;
>              }
> +            msg.state.index -= dev->vq_index;
>              memcpy(arg, &msg.state, sizeof(struct vhost_vring_state));
>              break;
>          default:
> diff --git a/include/net/net.h b/include/net/net.h
> index 6a6cbef..6f20656 100644
> --- a/include/net/net.h
> +++ b/include/net/net.h
> @@ -92,6 +92,7 @@ struct NetClientState {
>      NetClientDestructor *destructor;
>      unsigned int queue_index;
>      unsigned rxfilter_notify_enabled:1;
> +    void *opaque;
>  };
>  
>  typedef struct NICState {
> diff --git a/net/vhost-user.c b/net/vhost-user.c
> index 2d6bbe5..7d4ac69 100644
> --- a/net/vhost-user.c
> +++ b/net/vhost-user.c
> @@ -15,10 +15,16 @@
>  #include "qemu/config-file.h"
>  #include "qemu/error-report.h"
>  
> +struct VhostUserNetPeer {
> +    NetClientState *nc;
> +    VHostNetState  *vhost_net;
> +};
> +
>  typedef struct VhostUserState {
> -    NetClientState nc;
>      CharDriverState *chr;
> -    VHostNetState *vhost_net;
> +    bool running;
> +    int queues;
> +    struct VhostUserNetPeer peers[];
>  } VhostUserState;
>  
>  typedef struct VhostUserChardevProps {
> @@ -29,48 +35,75 @@ typedef struct VhostUserChardevProps {
>  
>  VHostNetState *vhost_user_get_vhost_net(NetClientState *nc)
>  {
> -    VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc);
> +    VhostUserState *s = nc->opaque;
>      assert(nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER);
> -    return s->vhost_net;
> -}
> -
> -static int vhost_user_running(VhostUserState *s)
> -{
> -    return (s->vhost_net) ? 1 : 0;
> +    return s->peers[nc->queue_index].vhost_net;
>  }
>  
>  static int vhost_user_start(VhostUserState *s)
>  {
>      VhostNetOptions options;
> +    VHostNetState *vhost_net;
> +    int max_queues;
> +    int i = 0;
>  
> -    if (vhost_user_running(s)) {
> +    if (s->running)
>          return 0;
> -    }
>  
>      options.backend_type = VHOST_BACKEND_TYPE_USER;
> -    options.net_backend = &s->nc;
>      options.opaque = s->chr;
>  
> -    s->vhost_net = vhost_net_init(&options);
> +    options.net_backend = s->peers[i].nc;
> +    vhost_net = s->peers[i++].vhost_net = vhost_net_init(&options);
> +
> +    max_queues = vhost_net_get_max_queues(vhost_net);
> +    if (s->queues >= max_queues) {
> +        error_report("you are asking more queues than supported: %d\n",
> +                     max_queues);
> +        return -1;
> +    }
> +
> +    for (; i < s->queues; i++) {
> +        options.net_backend = s->peers[i].nc;
> +
> +        s->peers[i].vhost_net = vhost_net_init(&options);
> +        if (!s->peers[i].vhost_net)
> +            return -1;
> +    }
> +    s->running = true;
>  
> -    return vhost_user_running(s) ? 0 : -1;
> +    return 0;
>  }
>  
>  static void vhost_user_stop(VhostUserState *s)
>  {
> -    if (vhost_user_running(s)) {
> -        vhost_net_cleanup(s->vhost_net);
> +    int i;
> +    VHostNetState *vhost_net;
> +
> +    if (!s->running)
> +        return;
> +
> +    for (i = 0;  i < s->queues; i++) {
> +        vhost_net = s->peers[i].vhost_net;
> +        if (vhost_net)
> +            vhost_net_cleanup(vhost_net);
>      }
>  
> -    s->vhost_net = 0;
> +    s->running = false;
>  }
>  
>  static void vhost_user_cleanup(NetClientState *nc)
>  {
> -    VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc);
> +    VhostUserState *s = nc->opaque;
> +    VHostNetState *vhost_net = s->peers[nc->queue_index].vhost_net;
> +
> +    if (vhost_net)
> +        vhost_net_cleanup(vhost_net);
>  
> -    vhost_user_stop(s);
>      qemu_purge_queued_packets(nc);
> +
> +    if (nc->queue_index == s->queues - 1)
> +        free(s);
>  }
>  
>  static bool vhost_user_has_vnet_hdr(NetClientState *nc)
> @@ -89,7 +122,7 @@ static bool vhost_user_has_ufo(NetClientState *nc)
>  
>  static NetClientInfo net_vhost_user_info = {
>          .type = NET_CLIENT_OPTIONS_KIND_VHOST_USER,
> -        .size = sizeof(VhostUserState),
> +        .size = sizeof(NetClientState),
>          .cleanup = vhost_user_cleanup,
>          .has_vnet_hdr = vhost_user_has_vnet_hdr,
>          .has_ufo = vhost_user_has_ufo,
> @@ -97,18 +130,25 @@ static NetClientInfo net_vhost_user_info = {
>  
>  static void net_vhost_link_down(VhostUserState *s, bool link_down)
>  {
> -    s->nc.link_down = link_down;
> +    NetClientState *nc;
> +    int i;
>  
> -    if (s->nc.peer) {
> -        s->nc.peer->link_down = link_down;
> -    }
> +    for (i = 0; i < s->queues; i++) {
> +        nc = s->peers[i].nc;
>  
> -    if (s->nc.info->link_status_changed) {
> -        s->nc.info->link_status_changed(&s->nc);
> -    }
> +        nc->link_down = link_down;
> +
> +        if (nc->peer) {
> +            nc->peer->link_down = link_down;
> +        }
>  
> -    if (s->nc.peer && s->nc.peer->info->link_status_changed) {
> -        s->nc.peer->info->link_status_changed(s->nc.peer);
> +        if (nc->info->link_status_changed) {
> +            nc->info->link_status_changed(nc);
> +        }
> +
> +        if (nc->peer && nc->peer->info->link_status_changed) {
> +            nc->peer->info->link_status_changed(nc->peer);
> +        }
>      }
>  }
>  
> @@ -118,7 +158,8 @@ static void net_vhost_user_event(void *opaque, int event)
>  
>      switch (event) {
>      case CHR_EVENT_OPENED:
> -        vhost_user_start(s);
> +        if (vhost_user_start(s) < 0)
> +            exit(1);
>          net_vhost_link_down(s, false);
>          error_report("chardev \"%s\" went up", s->chr->label);
>          break;
> @@ -131,24 +172,28 @@ static void net_vhost_user_event(void *opaque, int event)
>  }
>  
>  static int net_vhost_user_init(NetClientState *peer, const char *device,
> -                               const char *name, CharDriverState *chr)
> +                               const char *name, VhostUserState *s)
>  {
>      NetClientState *nc;
> -    VhostUserState *s;
> +    CharDriverState *chr = s->chr;
> +    int i;
>  
> -    nc = qemu_new_net_client(&net_vhost_user_info, peer, device, name);
> +    for (i = 0; i < s->queues; i++) {
> +        nc = qemu_new_net_client(&net_vhost_user_info, peer, device, name);
>  
> -    snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user to %s",
> -             chr->label);
> +        snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user%d to %s",
> +                 i, chr->label);
>  
> -    s = DO_UPCAST(VhostUserState, nc, nc);
> +        /* We don't provide a receive callback */
> +        nc->receive_disabled = 1;
>  
> -    /* We don't provide a receive callback */
> -    s->nc.receive_disabled = 1;
> -    s->chr = chr;
> -    nc->queue_index = 0;
> +        nc->queue_index = i;
> +        nc->opaque      = s;
>  
> -    qemu_chr_add_handlers(s->chr, NULL, NULL, net_vhost_user_event, s);
> +        s->peers[i].nc = nc;
> +    }
> +
> +    qemu_chr_add_handlers(chr, NULL, NULL, net_vhost_user_event, s);
>  
>      return 0;
>  }
> @@ -227,8 +272,10 @@ static int net_vhost_check_net(void *opaque, QemuOpts *opts, Error **errp)
>  int net_init_vhost_user(const NetClientOptions *opts, const char *name,
>                          NetClientState *peer, Error **errp)
>  {
> +    int queues;
>      const NetdevVhostUserOptions *vhost_user_opts;
>      CharDriverState *chr;
> +    VhostUserState *s;
>  
>      assert(opts->kind == NET_CLIENT_OPTIONS_KIND_VHOST_USER);
>      vhost_user_opts = opts->vhost_user;
> @@ -244,6 +291,11 @@ int net_init_vhost_user(const NetClientOptions *opts, const char *name,
>          return -1;
>      }
>  
> +    queues = vhost_user_opts->has_queues ? vhost_user_opts->queues : 1;
> +    s = g_malloc0(sizeof(VhostUserState) +
> +                  queues * sizeof(struct VhostUserNetPeer));
> +    s->queues = queues;
> +    s->chr    = chr;
>  
> -    return net_vhost_user_init(peer, "vhost_user", name, chr);
> +    return net_vhost_user_init(peer, "vhost_user", name, s);
>  }
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 67fef37..55c33db 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -2480,12 +2480,16 @@
>  #
>  # @vhostforce: #optional vhost on for non-MSIX virtio guests (default: false).
>  #
> +# @queues: #optional number of queues to be created for multiqueue vhost-user
> +#          (default: 1) (Since 2.5)
> +#
>  # Since 2.1
>  ##
>  { 'struct': 'NetdevVhostUserOptions',
>    'data': {
>      'chardev':        'str',
> -    '*vhostforce':    'bool' } }
> +    '*vhostforce':    'bool',
> +    '*queues':        'int' } }
>  
>  ##
>  # @NetClientOptions
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 77f5853..5bfa7a3 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -1963,13 +1963,14 @@ The hubport netdev lets you connect a NIC to a QEMU "vlan" instead of a single
>  netdev.  @code{-net} and @code{-device} with parameter @option{vlan} create the
>  required hub automatically.
>  
> address@hidden -netdev vhost-user,address@hidden,vhostforce=on|off]
> address@hidden -netdev vhost-user,address@hidden,vhostforce=on|off][,queues=n]
>  
>  Establish a vhost-user netdev, backed by a chardev @var{id}. The chardev should
>  be a unix domain socket backed one. The vhost-user uses a specifically defined
>  protocol to pass vhost ioctl replacement messages to an application on the other
>  end of the socket. On non-MSIX guests, the feature can be forced with
> address@hidden
> address@hidden Use 'address@hidden' to specify the number of queues to
> +be created for multiqueue vhost-user.
>  
>  Example:
>  @example
> -- 
> 1.9.0
> 


