qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PULL v2 04/36] virtio: Introduce started flag to Virti


From: Yongji Xie
Subject: Re: [Qemu-devel] [PULL v2 04/36] virtio: Introduce started flag to VirtioDevice
Date: Thu, 30 May 2019 08:39:13 +0800

On Wed, 29 May 2019 at 22:42, Dr. David Alan Gilbert
<address@hidden> wrote:
>
> * Yongji Xie (address@hidden) wrote:
> > On Wed, 29 May 2019 at 21:43, Dr. David Alan Gilbert
> > <address@hidden> wrote:
> > >
> > > * Greg Kurz (address@hidden) wrote:
> > > > On Wed, 29 May 2019 13:38:19 +0100
> > > > "Dr. David Alan Gilbert" <address@hidden> wrote:
> > > >
> > > > > * Greg Kurz (address@hidden) wrote:
> > > > > > On Wed, 29 May 2019 12:18:50 +0100
> > > > > > "Dr. David Alan Gilbert" <address@hidden> wrote:
> > > > > >
> > > > > > > * Greg Kurz (address@hidden) wrote:
> > > > > > > > On Tue, 28 May 2019 10:08:54 +1000
> > > > > > > > David Gibson <address@hidden> wrote:
> > > > > > > >
> > > > > > > > > On Fri, May 24, 2019 at 12:19:09PM +0200, Greg Kurz wrote:
> > > > > > > > > > On Mon, 20 May 2019 19:10:35 -0400
> > > > > > > > > > "Michael S. Tsirkin" <address@hidden> wrote:
> > > > > > > > > >
> > > > > > > > > > > From: Xie Yongji <address@hidden>
> > > > > > > > > > >
> > > > > > > > > > > The virtio 1.0 transitional devices support driver uses 
> > > > > > > > > > > the device
> > > > > > > > > > > before setting the DRIVER_OK status bit. So we introduce 
> > > > > > > > > > > a started
> > > > > > > > > > > flag to indicate whether driver has started the device or 
> > > > > > > > > > > not.
> > > > > > > > > > >
> > > > > > > > > > > Signed-off-by: Xie Yongji <address@hidden>
> > > > > > > > > > > Signed-off-by: Zhang Yu <address@hidden>
> > > > > > > > > > > Message-Id: <address@hidden>
> > > > > > > > > > > Reviewed-by: Michael S. Tsirkin <address@hidden>
> > > > > > > > > > > Signed-off-by: Michael S. Tsirkin <address@hidden>
> > > > > > > > > > > ---
> > > > > > > > > > >  include/hw/virtio/virtio.h |  2 ++
> > > > > > > > > > >  hw/virtio/virtio.c         | 52 
> > > > > > > > > > > ++++++++++++++++++++++++++++++++++++--
> > > > > > > > > > >  2 files changed, 52 insertions(+), 2 deletions(-)
> > > > > > > > > > >
> > > > > > > > > > > diff --git a/include/hw/virtio/virtio.h 
> > > > > > > > > > > b/include/hw/virtio/virtio.h
> > > > > > > > > > > index 7140381e3a..27c0efc3d0 100644
> > > > > > > > > > > --- a/include/hw/virtio/virtio.h
> > > > > > > > > > > +++ b/include/hw/virtio/virtio.h
> > > > > > > > > > > @@ -105,6 +105,8 @@ struct VirtIODevice
> > > > > > > > > > >      uint16_t device_id;
> > > > > > > > > > >      bool vm_running;
> > > > > > > > > > >      bool broken; /* device in invalid state, needs reset 
> > > > > > > > > > > */
> > > > > > > > > > > +    bool started;
> > > > > > > > > > > +    bool start_on_kick; /* virtio 1.0 transitional 
> > > > > > > > > > > devices support that */
> > > > > > > > > > >      VMChangeStateEntry *vmstate;
> > > > > > > > > > >      char *bus_name;
> > > > > > > > > > >      uint8_t device_endian;
> > > > > > > > > > > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > > > > > > > > > > index 28056a7ef7..5d533ac74e 100644
> > > > > > > > > > > --- a/hw/virtio/virtio.c
> > > > > > > > > > > +++ b/hw/virtio/virtio.c
> > > > > > > > > > > @@ -1162,10 +1162,16 @@ int 
> > > > > > > > > > > virtio_set_status(VirtIODevice *vdev, uint8_t val)
> > > > > > > > > > >              }
> > > > > > > > > > >          }
> > > > > > > > > > >      }
> > > > > > > > > > > +    vdev->started = val & VIRTIO_CONFIG_S_DRIVER_OK;
> > > > > > > > > > > +    if (unlikely(vdev->start_on_kick && vdev->started)) {
> > > > > > > > > > > +        vdev->start_on_kick = false;
> > > > > > > > > > > +    }
> > > > > > > > > > > +
> > > > > > > > > > >      if (k->set_status) {
> > > > > > > > > > >          k->set_status(vdev, val);
> > > > > > > > > > >      }
> > > > > > > > > > >      vdev->status = val;
> > > > > > > > > > > +
> > > > > > > > > > >      return 0;
> > > > > > > > > > >  }
> > > > > > > > > > >
> > > > > > > > > > > @@ -1208,6 +1214,9 @@ void virtio_reset(void *opaque)
> > > > > > > > > > >          k->reset(vdev);
> > > > > > > > > > >      }
> > > > > > > > > > >
> > > > > > > > > > > +    vdev->start_on_kick = (virtio_host_has_feature(vdev, 
> > > > > > > > > > > VIRTIO_F_VERSION_1) &&
> > > > > > > > > > > +                          !virtio_vdev_has_feature(vdev, 
> > > > > > > > > > > VIRTIO_F_VERSION_1));
> > > > > > > > > > > +    vdev->started = false;
> > > > > > > > > > >      vdev->broken = false;
> > > > > > > > > > >      vdev->guest_features = 0;
> > > > > > > > > > >      vdev->queue_sel = 0;
> > > > > > > > > > > @@ -1518,14 +1527,21 @@ void 
> > > > > > > > > > > virtio_queue_set_align(VirtIODevice *vdev, int n, int 
> > > > > > > > > > > align)
> > > > > > > > > > >
> > > > > > > > > > >  static bool virtio_queue_notify_aio_vq(VirtQueue *vq)
> > > > > > > > > > >  {
> > > > > > > > > > > +    bool ret = false;
> > > > > > > > > > > +
> > > > > > > > > > >      if (vq->vring.desc && vq->handle_aio_output) {
> > > > > > > > > > >          VirtIODevice *vdev = vq->vdev;
> > > > > > > > > > >
> > > > > > > > > > >          trace_virtio_queue_notify(vdev, vq - vdev->vq, 
> > > > > > > > > > > vq);
> > > > > > > > > > > -        return vq->handle_aio_output(vdev, vq);
> > > > > > > > > > > +        ret = vq->handle_aio_output(vdev, vq);
> > > > > > > > > > > +
> > > > > > > > > > > +        if (unlikely(vdev->start_on_kick)) {
> > > > > > > > > > > +            vdev->started = true;
> > > > > > > > > > > +            vdev->start_on_kick = false;
> > > > > > > > > > > +        }
> > > > > > > > > > >      }
> > > > > > > > > > >
> > > > > > > > > > > -    return false;
> > > > > > > > > > > +    return ret;
> > > > > > > > > > >  }
> > > > > > > > > > >
> > > > > > > > > > >  static void virtio_queue_notify_vq(VirtQueue *vq)
> > > > > > > > > > > @@ -1539,6 +1555,11 @@ static void 
> > > > > > > > > > > virtio_queue_notify_vq(VirtQueue *vq)
> > > > > > > > > > >
> > > > > > > > > > >          trace_virtio_queue_notify(vdev, vq - vdev->vq, 
> > > > > > > > > > > vq);
> > > > > > > > > > >          vq->handle_output(vdev, vq);
> > > > > > > > > > > +
> > > > > > > > > > > +        if (unlikely(vdev->start_on_kick)) {
> > > > > > > > > > > +            vdev->started = true;
> > > > > > > > > > > +            vdev->start_on_kick = false;
> > > > > > > > > > > +        }
> > > > > > > > > > >      }
> > > > > > > > > > >  }
> > > > > > > > > > >
> > > > > > > > > > > @@ -1556,6 +1577,11 @@ void 
> > > > > > > > > > > virtio_queue_notify(VirtIODevice *vdev, int n)
> > > > > > > > > > >      } else if (vq->handle_output) {
> > > > > > > > > > >          vq->handle_output(vdev, vq);
> > > > > > > > > > >      }
> > > > > > > > > > > +
> > > > > > > > > > > +    if (unlikely(vdev->start_on_kick)) {
> > > > > > > > > > > +        vdev->started = true;
> > > > > > > > > > > +        vdev->start_on_kick = false;
> > > > > > > > > > > +    }
> > > > > > > > > > >  }
> > > > > > > > > > >
> > > > > > > > > > >  uint16_t virtio_queue_vector(VirtIODevice *vdev, int n)
> > > > > > > > > > > @@ -1770,6 +1796,13 @@ static bool 
> > > > > > > > > > > virtio_broken_needed(void *opaque)
> > > > > > > > > > >      return vdev->broken;
> > > > > > > > > > >  }
> > > > > > > > > > >
> > > > > > > > > > > +static bool virtio_started_needed(void *opaque)
> > > > > > > > > > > +{
> > > > > > > > > > > +    VirtIODevice *vdev = opaque;
> > > > > > > > > > > +
> > > > > > > > > > > +    return vdev->started;
> > > > > > > > > >
> > > > > > > > > > Existing machine types don't know about the 
> > > > > > > > > > "virtio/started" subsection. This
> > > > > > > > > > breaks migration to older QEMUs if the driver has started 
> > > > > > > > > > the device, ie. most
> > > > > > > > > > probably always when it comes to live migration.
> > > > > > > > > >
> > > > > > > > > > My understanding is that we do try to support backward 
> > > > > > > > > > migration though. It
> > > > > > > > > > is a regular practice in datacenters to migrate workloads 
> > > > > > > > > > without having to
> > > > > > > > > > take care of the QEMU version. FWIW I had to fix similar 
> > > > > > > > > > issues downstream
> > > > > > > > > > many times in the past because customers had filed bugs.
> > > > > > > > > >
> > > > > > > > > > Cc'ing David for his opinion.
> > > > > > > > >
> > > > > > > > > Uh.. did you mean to CC me, or Dave Gilbert?
> > > > > > > > >
> > > > > > > >
> > > > > > > > Oops... Dave Gilbert indeed, but you're thoughts on that matter 
> > > > > > > > are valuable
> > > > > > > > as well. I remember being involved in backward migration fixes 
> > > > > > > > for spapr
> > > > > > > > several times.
> > > > > > > >
> > > > > > > > > I mean, I think you're right that we should try to maintain 
> > > > > > > > > backwards
> > > > > > > > > migration, but this isn't really my area of authority.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Cc'ing Dave Gilbert :)
> > > > > > >
> > > > > > > Right, I need to maintain backwards migration compatibility; tie 
> > > > > > > the
> > > > > > > feature to a machine type so it's only used on newer machine 
> > > > > > > types and
> > > > > > > then we'll be safe.
> > > > > > >
> > > > > > > Having said that, what's the symptom when this goes wrong?
> > > > > > >
> > > > > >
> > > > > > Since the started flag is set as soon as the guest driver begins to 
> > > > > > use
> > > > > > the device and remains so until next reset, the associated 
> > > > > > subsection is
> > > > > > basically always emitted when migrating a booted guest. This causes
> > > > > > migration to always fail on the target in this case:
> > > > >
> > > > > I meant what's the symptom without this patch series at all?
> > > > >
> > > >
> > > > Oh sorry... if I got it right, migrating when the guest first kicked the
> > > > device but not set the DRIVE_OK bit yet would result in a guest hang on
> > > > the destination. Yongji might elaborate a bit more on that.
> > >
> > > Hmm - the only thing worse than a migration failing with an error is a
> > > migration that fails with a hung guest.
> > >
> > > If you were sending a new subsection in *only* the case where the guest
> > > would definitely hang, then I think it would be worth sending the
> > > subsection and allowing the backwards migration to fail with the error
> > > because it's a bit better than a hung VM.
> > >
> >
> > I considered this approach before. But it needs to add some complex
> > logic in virtio code to make sure migration works well between
> > different qemu versions. Since it's hardly to trigger the hung VM bug.
> > So I still try to use the general way (tie the feature to a machine
> > type) to fix this migration issue. It's easily to ensure the
> > compatibility
>
> OK, great.  For reference could you give me a bit more info on how
> exactly the failure happens, how you spot it and what the symptoms are;
> so that when someone comes to me with a hung VM I know what to look for?
>

Oh, sure. This failure would only happen on an old guest (e.g. centos
6u5, centos 6u3) with virtio-blk/vhost-user-blk device. This kind of
guest might be hung when migration completes at the point that guest
is booting (loading virtio-blk driver) because its first I/O would be
hung.

Thanks,
Yongji



reply via email to

[Prev in Thread] Current Thread [Next in Thread]