[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] block/vhost-user-blk: Fix hang on boot for some odd guests
From: |
Michael S. Tsirkin |
Subject: |
Re: [PATCH] block/vhost-user-blk: Fix hang on boot for some odd guests |
Date: |
Tue, 18 Apr 2023 02:17:22 -0400 |
On Tue, Apr 18, 2023 at 05:13:11AM +0000, Raphael Norwitz wrote:
> Hey Andrey - apologies for the late reply here.
>
> It sounds like you are dealing with a buggy guest, rather than a QEMU issue.
>
> > On Apr 10, 2023, at 11:39 AM, Andrey Ryabinin <arbn@yandex-team.com> wrote:
> >
> >
> >
> > On 4/10/23 10:35, Andrey Ryabinin wrote:
> >> Some guests hang on boot when using the vhost-user-blk-pci device,
> >> but boot normally when using the virtio-blk device. The problem occurs
> >> because the guest advertises VIRTIO_F_VERSION_1 but kicks the virtqueue
> >> before setting VIRTIO_CONFIG_S_DRIVER_OK, causing vdev->start_on_kick to
>
> Virtio 1.1 Section 3.1.1, says during setup “[t]he driver MUST NOT notify the
> device before setting DRIVER_OK.”
>
> Therefore what you are describing is buggy guest behavior. Sounds like the
> driver should be made to either
> - not advertise VIRTIO_F_VERSION_1
> - not kick before setting VIRTIO_CONFIG_S_DRIVER_OK
>
> If anything, the virtio-blk virtio_blk_handle_output() function should
> probably check start_on_kick?
Question is, how easy is this guest to fix.
> >> be false in vhost_user_blk_handle_output() and preventing the device from
> >> starting.
> >>
> >> Fix this by removing the check for vdev->start_on_kick to ensure
> >> that the device starts after the kick. This aligns the behavior of
> >> 'vhost-user-blk-pci' device with 'virtio-blk' as it does the similar
> >> thing in its virtio_blk_handle_output() function.
> >>
> >> Fixes: 110b9463d5c8 ("vhost-user-blk: start vhost when guest kicks")
> >> Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com>
> >> ---
> >> hw/block/vhost-user-blk.c | 4 ----
> >> 1 file changed, 4 deletions(-)
> >>
> >> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> >> index aff4d2b8cbd..448ead448f3 100644
> >> --- a/hw/block/vhost-user-blk.c
> >> +++ b/hw/block/vhost-user-blk.c
> >> @@ -279,10 +279,6 @@ static void vhost_user_blk_handle_output(VirtIODevice
> >> *vdev, VirtQueue *vq)
> >> Error *local_err = NULL;
> >> int i, ret;
> >>
> >> - if (!vdev->start_on_kick) {
> >> - return;
> >> - }
> >> -
> >> if (!s->connected) {
> >> return;
> >> }
> >
> >
> > After looking a bit closer to this ->start_on_kick thing ( commit
> > badaf79cfdbd ("virtio: Introduce started flag to VirtioDevice")
> > and follow ups) I'm starting to think that removing it entirely would be
> > the right thing to do here.
> > The whole reason for it was to add special case for !VIRTIO_F_VERSION_1
> > guests.
>
> The virtio 1.0 spec section 2.1.2 explicitly says: "The device MUST NOT
> consume buffers or notify the driver before DRIVER_OK.”
>
> Your change here would make QEMU violate this condition. I don’t know what
> the danger is but I assume that wording is there for a reason.
>
> Unless MST or Cornellia (CCed) say otherwise I don’t think this is the
> correct approach.
If we propagate HV workarounds then guests do not get fixed.
> > If we making start on kick thing for misbehaving VIRTIO_F_VERSION_1 guests
> > too, than the flag is no longer required,
> > so we can do following:
> >
> > ---
> > hw/block/vhost-user-blk.c | 4 ----
> > hw/virtio/virtio-qmp.c | 2 +-
> > hw/virtio/virtio.c | 21 ++-------------------
> > include/hw/virtio/virtio.h | 5 -----
> > 4 files changed, 3 insertions(+), 29 deletions(-)
> >
> > diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> > index aff4d2b8cbd..448ead448f3 100644
> > --- a/hw/block/vhost-user-blk.c
> > +++ b/hw/block/vhost-user-blk.c
> > @@ -279,10 +279,6 @@ static void vhost_user_blk_handle_output(VirtIODevice
> > *vdev, VirtQueue *vq)
> > Error *local_err = NULL;
> > int i, ret;
> >
> > - if (!vdev->start_on_kick) {
> > - return;
> > - }
> > -
> > if (!s->connected) {
> > return;
> > }
> > diff --git a/hw/virtio/virtio-qmp.c b/hw/virtio/virtio-qmp.c
> > index e4d4bece2d7..4865819cd2f 100644
> > --- a/hw/virtio/virtio-qmp.c
> > +++ b/hw/virtio/virtio-qmp.c
> > @@ -773,7 +773,7 @@ VirtioStatus *qmp_x_query_virtio_status(const char
> > *path, Error **errp)
> > status->disabled = vdev->disabled;
> > status->use_started = vdev->use_started;
> > status->started = vdev->started;
> > - status->start_on_kick = vdev->start_on_kick;
> > + status->start_on_kick = true;
> > status->disable_legacy_check = vdev->disable_legacy_check;
> > status->bus_name = g_strdup(vdev->bus_name);
> > status->use_guest_notifier_mask = vdev->use_guest_notifier_mask;
> > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > index f35178f5fcd..218584eae85 100644
> > --- a/hw/virtio/virtio.c
> > +++ b/hw/virtio/virtio.c
> > @@ -2126,7 +2126,6 @@ void virtio_reset(void *opaque)
> > k->reset(vdev);
> > }
> >
> > - vdev->start_on_kick = false;
> > vdev->started = false;
> > vdev->broken = false;
> > vdev->guest_features = 0;
> > @@ -2248,9 +2247,7 @@ static void virtio_queue_notify_vq(VirtQueue *vq)
> > trace_virtio_queue_notify(vdev, vq - vdev->vq, vq);
> > vq->handle_output(vdev, vq);
> >
> > - if (unlikely(vdev->start_on_kick)) {
> > - virtio_set_started(vdev, true);
> > - }
> > + virtio_set_started(vdev, true);
> > }
> > }
> >
> > @@ -2268,9 +2265,7 @@ void virtio_queue_notify(VirtIODevice *vdev, int n)
> > } else if (vq->handle_output) {
> > vq->handle_output(vdev, vq);
> >
> > - if (unlikely(vdev->start_on_kick)) {
> > - virtio_set_started(vdev, true);
> > - }
> > + virtio_set_started(vdev, true);
> > }
> > }
> >
> > @@ -2881,12 +2876,6 @@ int virtio_set_features(VirtIODevice *vdev, uint64_t
> > val)
> > }
> > }
> > }
> > - if (!ret) {
> > - if (!virtio_device_started(vdev, vdev->status) &&
> > - !virtio_vdev_has_feature(vdev, VIRTIO_F_VERSION_1)) {
> > - vdev->start_on_kick = true;
> > - }
> > - }
> > return ret;
> > }
> >
> > @@ -3039,11 +3028,6 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f, int
> > version_id)
> > }
> > }
> >
> > - if (!virtio_device_started(vdev, vdev->status) &&
> > - !virtio_vdev_has_feature(vdev, VIRTIO_F_VERSION_1)) {
> > - vdev->start_on_kick = true;
> > - }
> > -
> > RCU_READ_LOCK_GUARD();
> > for (i = 0; i < num; i++) {
> > if (vdev->vq[i].vring.desc) {
> > @@ -3162,7 +3146,6 @@ void virtio_init(VirtIODevice *vdev, uint16_t
> > device_id, size_t config_size)
> > g_malloc0(sizeof(*vdev->vector_queues) * nvectors);
> > }
> >
> > - vdev->start_on_kick = false;
> > vdev->started = false;
> > vdev->vhost_started = false;
> > vdev->device_id = device_id;
> > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > index 77c6c55929f..5742876b4fa 100644
> > --- a/include/hw/virtio/virtio.h
> > +++ b/include/hw/virtio/virtio.h
> > @@ -144,7 +144,6 @@ struct VirtIODevice
> > */
> > bool use_started;
> > bool started;
> > - bool start_on_kick; /* when virtio 1.0 feature has not been negotiated
> > */
> > bool disable_legacy_check;
> > bool vhost_started;
> > VMChangeStateEntry *vmstate;
> > @@ -460,10 +459,6 @@ static inline bool
> > virtio_device_should_start(VirtIODevice *vdev, uint8_t status
> >
> > static inline void virtio_set_started(VirtIODevice *vdev, bool started)
> > {
> > - if (started) {
> > - vdev->start_on_kick = false;
> > - }
> > -
> > if (vdev->use_started) {
> > vdev->started = started;
> > }
> > --
> > 2.39.2
>