qemu-stable

Re: [Qemu-stable] Patch Round-up for stable 2.1.1, freeze on 2014-09-03


From: Andrey Korolyov
Subject: Re: [Qemu-stable] Patch Round-up for stable 2.1.1, freeze on 2014-09-03
Date: Wed, 3 Sep 2014 11:43:54 +0400

On Wed, Sep 3, 2014 at 10:10 AM, Michael S. Tsirkin <address@hidden> wrote:
> On Wed, Sep 03, 2014 at 02:17:02AM +0400, Andrey Korolyov wrote:
>> On Wed, Sep 3, 2014 at 2:09 AM, Andrey Korolyov <address@hidden> wrote:
>> > On Wed, Sep 3, 2014 at 1:51 AM, Michael S. Tsirkin <address@hidden> wrote:
>> >> On Wed, Sep 03, 2014 at 01:29:29AM +0400, Andrey Korolyov wrote:
>> >>> On Wed, Sep 3, 2014 at 1:03 AM, Michael S. Tsirkin <address@hidden> wrote:
>> >>> >> bad one is the
>> >>> >>
>> >>> >> Author: Jason Wang <address@hidden>
>> >>> >> Date:   Tue Sep 2 18:07:46 2014 +0300
>> >>> >>
>> >>> >>     vhost_net: start/stop guest notifiers properly
>> >>> >
>> >>> >
>> >>> >
>> >>> > upstream has this (pull request sent today):
>> >>> > vhost_net: cleanup start/stop condition
>> >>> >
>> >>> > Could you apply it and see if it helps please?
>> >>> >
>> >>> > Michael, if it helps it should be before start/stop guest notifiers
>> >>> > ideally to avoid bisect problems.
>> >>>
>> >>> It is already applied as shown from the list in the previous message
>> >>> (there are some aio fixes too on top of 2.1 I picked before but they
>> >>> should not impact vhost-net interaction in any mean). The symptoms are
>> >>> a bit interesting - VM crashes only at PCI device initalization (e.g.
>> >>> grub stage after reset and initrd unpacking are passing well, but then
>> >>> things getting ugly). I am running 3.14 guest i686-pae kernel from
>> >>> debian backports in guest, so it may be version-specific after all. If
>> >>> it`ll be hard to reproduce, I can try 64bit, expecting same behavior.
>> >>> Please find args in attached file.
>> >>
>> >>
>> >>
>> >> ok just to make sure - which tree do I clone exactly?
>> >>
>> >
>> > https://github.com/mdroth/qemu.git stable-2.1-staging showing same
>> > behavior for me with those patches
>>
>> Forgot to mention important detail - I am playing with -mq now, so
>> actually virtio-net working in a bit different way than it may
>> expected (it also shown in args list from above, but someone may miss
>> it):
>> ...
>> qemu-system-x86_64: unable to start vhost net: 95: falling back on
>> userspace virtio
>> qemu-system-x86_64: unable to start vhost net: 95: falling back on
>> userspace virtio
>> ...
>
>
> OK I see at least one obvious bug there: does the following fix the
> crash for you?
> Separately, we need to debug why mq vhost is broken for you.
> Is this a regression?
>
> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> index ba5d544..1fe18c7 100644
> --- a/hw/net/vhost_net.c
> +++ b/hw/net/vhost_net.c
> @@ -289,7 +289,7 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>      BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(dev)));
>      VirtioBusState *vbus = VIRTIO_BUS(qbus);
>      VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
> -    int r, i = 0;
> +    int r, i;
>
>      if (!vhost_net_device_endian_ok(dev)) {
>          error_report("vhost-net does not support cross-endian");
> @@ -317,16 +317,22 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>          r = vhost_net_start_one(get_vhost_net(ncs[i].peer), dev);
>
>          if (r < 0) {
> -            goto err;
> +            goto err_start;
>          }
>      }
>
>      return 0;
>
> -err:
> +err_start:
>      while (--i >= 0) {
>          vhost_net_stop_one(get_vhost_net(ncs[i].peer), dev);
>      }
> +err:
> +    r = k->set_guest_notifiers(qbus->parent, total_queues * 2, false);
> +    if (r < 0) {
> +        fprintf(stderr, "vhost guest notifier cleanup failed: %d\n", r);
> +        fflush(stderr);
> +    }
>      return r;
>  }
>


Some more information:
 - the userspace fallback is not specific to mq (very unfortunately
for me, because I did not check for this exact regression a week ago
when I saw it with mq), and it is not specific to the patches queued
for 2.1.1,
 - the bug itself is not specific to mq; it reproduces every time, even
with a more generic interface config without queues,
 - the patch from above does not fix the issue.
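For anyone following the thread, the error path that the patch above
restructures can be sketched in isolation. This is only a minimal
illustration, not QEMU code: set_notifiers(), start_one(), stop_one()
and start_all() are hypothetical stand-ins for the notifier setup and
the per-queue vhost start/stop calls. Unlike the quoted patch, the
sketch keeps the original error code in r instead of overwriting it
with the result of the notifier cleanup; that is an assumption about
the intended behaviour, not something taken from the patch.

```c
#include <stdio.h>

#define MAX_QUEUES 8

/* Hypothetical state standing in for the real vhost backends. */
static int started[MAX_QUEUES];
static int notifiers_enabled;

/* Stand-in for enabling/disabling guest notifiers; always succeeds here. */
static int set_notifiers(int enable)
{
    notifiers_enabled = enable;
    return 0;
}

/* Stand-in for starting one queue; fails with -95 at index fail_at. */
static int start_one(int idx, int fail_at)
{
    if (idx == fail_at) {
        return -95;
    }
    started[idx] = 1;
    return 0;
}

/* Stand-in for stopping one queue. */
static void stop_one(int idx)
{
    started[idx] = 0;
}

/* Start all queues; on failure undo exactly what was already done. */
int start_all(int total, int fail_at)
{
    int r, i;

    r = set_notifiers(1);
    if (r < 0) {
        goto err;               /* nothing has been set up yet */
    }

    for (i = 0; i < total; i++) {
        r = start_one(i, fail_at);
        if (r < 0) {
            goto err_start;     /* queues 0..i-1 are running */
        }
    }
    return 0;

err_start:
    while (--i >= 0) {
        stop_one(i);            /* stop in reverse order */
    }
    if (set_notifiers(0) < 0) {
        fprintf(stderr, "notifier cleanup failed\n");
    }
err:
    return r;                   /* still the error that caused the unwind */
}
```

Calling start_all(4, 2) in this sketch returns -95 and leaves no queue
started and the notifiers disabled, while start_all(4, -1) returns 0
with all four queues running.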

Strace output for all threads is available at
http://xdel.ru/downloads/qemu.out.gz; strace was attached just before
the reset.
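When reading the trace it may help to decode the 95 from the "unable
to start vhost net" messages quoted earlier. On Linux x86, errno 95 is
EOPNOTSUPP ("Operation not supported"). Assuming the number in that
message is an errno value (I have not checked this against the source),
a small helper like the one below can decode it. describe_errno() and
is_not_supported() are hypothetical names used only for illustration.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Return the C library's description of an errno-style code.
 * Accepts both the positive form and the negated "-errno" form. */
const char *describe_errno(int code)
{
    return strerror(abs(code));
}

/* Report whether a code (positive or negated) means "not supported". */
int is_not_supported(int code)
{
    return abs(code) == EOPNOTSUPP;
}
```

The numeric value of EOPNOTSUPP differs between platforms, so the
helper compares against the symbolic constant rather than against a
hard-coded 95.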


