Re: [Qemu-devel] [PATCH v7 6/7] vhost-user-blk: Add support to reconnect
From: Yongji Xie
Subject: Re: [Qemu-devel] [PATCH v7 6/7] vhost-user-blk: Add support to reconnect backend
Date: Fri, 15 Mar 2019 20:37:04 +0800
On Fri, 15 Mar 2019 at 18:41, Yury Kotov <address@hidden> wrote:
>
> 15.03.2019, 12:46, "Daniel P. Berrangé" <address@hidden>:
> > On Thu, Mar 14, 2019 at 03:31:47PM +0300, Yury Kotov wrote:
> >> Hi,
> >>
> >> 14.03.2019, 14:44, "Daniel P. Berrangé" <address@hidden>:
> >> > On Thu, Mar 14, 2019 at 07:34:03AM -0400, Michael S. Tsirkin wrote:
> >> >> On Thu, Mar 14, 2019 at 11:24:22AM +0000, Daniel P. Berrangé wrote:
> >> >> > On Tue, Mar 12, 2019 at 12:49:35PM -0400, Michael S. Tsirkin wrote:
> >> >> > > On Thu, Feb 28, 2019 at 04:53:54PM +0800, address@hidden wrote:
> >> >> > > > From: Xie Yongji <address@hidden>
> >> >> > > >
> >> >> > > > Since we now support the message VHOST_USER_GET_INFLIGHT_FD
> >> >> > > > and VHOST_USER_SET_INFLIGHT_FD. The backend is able to restart
> >> >> > > > safely because it can track inflight I/O in shared memory.
> >> >> > > > This patch allows qemu to reconnect the backend after
> >> >> > > > connection closed.
> >> >> > > >
> >> >> > > > Signed-off-by: Xie Yongji <address@hidden>
> >> >> > > > Signed-off-by: Ni Xun <address@hidden>
> >> >> > > > Signed-off-by: Zhang Yu <address@hidden>
> >> >> > > > ---
> >> >> > > > hw/block/vhost-user-blk.c | 205 +++++++++++++++++++++++------
> >> >> > > > include/hw/virtio/vhost-user-blk.h | 4 +
> >> >> > > > 2 files changed, 167 insertions(+), 42 deletions(-)
> >> >> >
> >> >> >
> >> >> > > > static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
> >> >> > > > {
> >> >> > > > VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> >> >> > > > VHostUserBlk *s = VHOST_USER_BLK(vdev);
> >> >> > > > VhostUserState *user;
> >> >> > > > - struct vhost_virtqueue *vqs = NULL;
> >> >> > > > int i, ret;
> >> >> > > > + Error *err = NULL;
> >> >> > > >
> >> >> > > > if (!s->chardev.chr) {
> >> >> > > > error_setg(errp, "vhost-user-blk: chardev is mandatory");
> >> >> > > > @@ -312,27 +442,28 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
> >> >> > > > }
> >> >> > > >
> >> >> > > > s->inflight = g_new0(struct vhost_inflight, 1);
> >> >> > > > -
> >> >> > > > - s->dev.nvqs = s->num_queues;
> >> >> > > > - s->dev.vqs = g_new(struct vhost_virtqueue, s->dev.nvqs);
> >> >> > > > - s->dev.vq_index = 0;
> >> >> > > > - s->dev.backend_features = 0;
> >> >> > > > - vqs = s->dev.vqs;
> >> >> > > > -
> >> >> > > > - vhost_dev_set_config_notifier(&s->dev, &blk_ops);
> >> >> > > > -
> >> >> > > > - ret = vhost_dev_init(&s->dev, s->vhost_user, VHOST_BACKEND_TYPE_USER, 0);
> >> >> > > > - if (ret < 0) {
> >> >> > > > - error_setg(errp, "vhost-user-blk: vhost initialization failed: %s",
> >> >> > > > - strerror(-ret));
> >> >> > > > - goto virtio_err;
> >> >> > > > - }
> >> >> > > > + s->vqs = g_new(struct vhost_virtqueue, s->num_queues);
> >> >> > > > + s->watch = 0;
> >> >> > > > + s->should_start = false;
> >> >> > > > + s->connected = false;
> >> >> > > > +
> >> >> > > > + qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, vhost_user_blk_event,
> >> >> > > > + NULL, (void *)dev, NULL, true);
> >> >> > > > +
> >> >> > > > +reconnect:
> >> >> > > > + do {
> >> >> > > > + if (qemu_chr_fe_wait_connected(&s->chardev, &err) < 0) {
> >> >> > > > + error_report_err(err);
> >> >> > > > + err = NULL;
> >> >> > > > + sleep(1);
> >> >> > >
> >> >> > > Seems arbitrary. Is this basically waiting until backend will reconnect?
> >> >> > > Why not block until event on the fd triggers?
> >> >> > >
> >> >> > > Also, it looks like this will just block forever with no monitor input
> >> >> > > and no way for user to figure out what is going on short of
> >> >> > > crashing QEMU.
> >> >> >
> >> >> > FWIW, the current vhost-user-net device does exactly the same thing
> >> >> > with calling qemu_chr_fe_wait_connected during its realize() function.
> >> >>
> >> >> Hmm yes. It doesn't sleep for an arbitrary 1 sec so less of an eyesore :)
> >> >
> >> > The sleep(1) in this patch simply needs to be removed. I think that
> >> > probably dates from when it was written against the earlier broken
> >> > version of qemu_chr_fe_wait_connected(). That would not correctly
> >> > deal with the "reconnect" flag, and so needed this loop with a sleep
> >> > in it.
> >> >
> >> > In fact the while loop can be removed as well in this code. It just
> >> > needs to call qemu_chr_fe_wait_connected() once. It is guaranteed
> >> > to have a connected peer once that returns 0.
> >> >
> >> > qemu_chr_fe_wait_connected() only returns -1 if it is operating in
> >> > client mode, it failed to connect, and reconnect is *not*
> >> > requested. In such a case the caller should honour the failure and
> >> > quit, not loop to retry.
> >> >
> >> > The reason vhost-user-net does a loop is because once it has a
> >> > connection it tries to do a protocol handshake, and if that
> >> > handshake fails it closes the chardev and tries to connect
> >> > again. That's not the case in this blk code, so the loop is
> >> > not needed.
> >> >
> >>
> >> But vhost-user-blk also has a handshake in device realize. What happens if the
> >> connection is broken during realization? IIUC we have to retry a handshake in
> >> such case just like vhost-user-net.
> >
> > I'm just commenting on the current code which does not do that
> > handshake in the loop afaict. If it needs to do that then the
> > patch should be updated...
> >
>
> Oh, yes... This loop doesn't do a handshake. The handshake is after the loop.
> But now it jumps back to reconnect with a goto. So maybe it makes sense to rewrite the handshake,
> since we don't need two nested loops to get reconnection without gotos.
>
Actually we do a handshake in the loop, through this call chain:

qemu_chr_fe_wait_connected()
  tcp_chr_wait_connected()
    tcp_chr_connect_client_sync()
      tcp_chr_new_client()
        qemu_chr_be_event(chr, CHR_EVENT_OPENED);
          vhost_user_blk_event()
            vhost_user_blk_connect()
              vhost_dev_init()

Then I use s->connected to check the result of vhost_dev_init().
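
To make the control flow concrete, here is a minimal standalone sketch of the pattern, not the patch itself. Every name in it (MockBlk, mock_dev_init, mock_event_opened, mock_wait_connected, mock_realize) is a hypothetical stand-in for the corresponding QEMU piece, and none of it calls real QEMU APIs. It assumes the OPENED event is delivered synchronously inside the wait call, as in the chain above. Unlike the patch, which retries without limit, the sketch bounds the number of attempts so that it always terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for VHostUserBlk; not a QEMU type. */
typedef struct MockBlk {
    bool connected;        /* plays the role of s->connected */
    int handshake_fails;   /* handshakes that fail before one succeeds */
    int connect_attempts;  /* how often the wait function was called */
} MockBlk;

/* Stand-in for vhost_dev_init(): 0 on success, negative on failure. */
static int mock_dev_init(MockBlk *s)
{
    if (s->handshake_fails > 0) {
        s->handshake_fails--;
        return -1;
    }
    return 0;
}

/* Stand-in for vhost_user_blk_event() handling CHR_EVENT_OPENED:
 * the handshake runs here and its result is recorded in the flag. */
static void mock_event_opened(MockBlk *s)
{
    s->connected = (mock_dev_init(s) == 0);
}

/* Stand-in for qemu_chr_fe_wait_connected(): the transport always
 * connects, and the OPENED event fires before the function returns. */
static int mock_wait_connected(MockBlk *s)
{
    s->connect_attempts++;
    mock_event_opened(s);
    return 0;
}

/* Realize loop: a return of 0 from the wait only means the socket is
 * up; the flag tells us whether the handshake in the callback worked. */
static int mock_realize(MockBlk *s, int max_attempts)
{
    assert(s != NULL);
    s->connected = false;
    for (int i = 0; i < max_attempts; i++) {
        if (mock_wait_connected(s) < 0) {
            return -1;  /* transport failure: honour it and stop */
        }
        if (s->connected) {
            return 0;   /* handshake succeeded inside the callback */
        }
    }
    return -1;          /* attempts exhausted (a bound the patch does not have) */
}
```

With handshake_fails set to 2, mock_realize() returns 0 on the third call to mock_wait_connected(). The point of the sketch is only that the loop retries on the handshake result carried by the flag, not on the return value of the wait call.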
Thanks,
Yongji