qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PULL 18/20] block/nbd: drop connection_co


From: Fabian Ebner
Subject: Re: [PULL 18/20] block/nbd: drop connection_co
Date: Wed, 2 Feb 2022 12:49:36 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.5.0

Am 27.09.21 um 23:55 schrieb Eric Blake:
> From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 
> OK, that's a big rewrite of the logic.
> 
> Pre-patch we have an always running coroutine - connection_co. It does
> reply receiving and reconnecting. And it leads to a lot of difficult
> and unobvious code around drained sections and context switch. We also
> abuse bs->in_flight counter which is increased for connection_co and
> temporary decreased in points where we want to allow drained section to
> begin. One of these place is in another file: in nbd_read_eof() in
> nbd/client.c.
> 
> We also cancel reconnect and requests waiting for reconnect on drained
> begin which is not correct. And this patch fixes that.
> 
> Let's finally drop this always running coroutine and go another way:
> do both reconnect and receiving in request coroutines.
>

Hi,

while updating our stack to 6.2, one of our live-migration tests stopped
working (backtrace is below) and bisecting led me to this patch.

The VM has a single qcow2 disk (converting to raw doesn't make a
difference) and the issue only appears when using iothread (for both
virtio-scsi-pci and virtio-block-pci).

Reverting 1af7737871fb3b66036f5e520acb0a98fc2605f7 (which lives on top)
and 4ddb5d2fde6f22b2cf65f314107e890a7ca14fcf (the commit corresponding
to this patch) in v6.2.0 makes the migration work again.

Backtrace:

Thread 1 (Thread 0x7f9d93458fc0 (LWP 56711) "kvm"):
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f9d9d6bc537 in __GI_abort () at abort.c:79
#2  0x00007f9d9d6bc40f in __assert_fail_base (fmt=0x7f9d9d825128
"%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5579153763f8
"qemu_get_current_aio_context() == qemu_coroutine_get_aio_context(co)",
file=0x5579153764f9 "../io/channel.c", line=483, function=<optimized
out>) at assert.c:92
#3  0x00007f9d9d6cb662 in __GI___assert_fail
(assertion=assertion@entry=0x5579153763f8
"qemu_get_current_aio_context() == qemu_coroutine_get_aio_context(co)",
file=file@entry=0x5579153764f9 "../io/channel.c", line=line@entry=483,
function=function@entry=0x557915376570 <__PRETTY_FUNCTION__.2>
"qio_channel_restart_read") at assert.c:101
#4  0x00005579150c351c in qio_channel_restart_read (opaque=<optimized
out>) at ../io/channel.c:483
#5  qio_channel_restart_read (opaque=<optimized out>) at ../io/channel.c:477
#6  0x000055791520182a in aio_dispatch_handler
(ctx=ctx@entry=0x557916908c60, node=0x7f9d8400f800) at
../util/aio-posix.c:329
#7  0x0000557915201f62 in aio_dispatch_handlers (ctx=0x557916908c60) at
../util/aio-posix.c:372
#8  aio_dispatch (ctx=0x557916908c60) at ../util/aio-posix.c:382
#9  0x00005579151ea74e in aio_ctx_dispatch (source=<optimized out>,
callback=<optimized out>, user_data=<optimized out>) at ../util/async.c:311
#10 0x00007f9d9e647e6b in g_main_context_dispatch () from
/lib/x86_64-linux-gnu/libglib-2.0.so.0
#11 0x0000557915203030 in glib_pollfds_poll () at ../util/main-loop.c:232
#12 os_host_main_loop_wait (timeout=992816) at ../util/main-loop.c:255
#13 main_loop_wait (nonblocking=nonblocking@entry=0) at
../util/main-loop.c:531
#14 0x00005579150539c1 in qemu_main_loop () at ../softmmu/runstate.c:726
#15 0x0000557914ce8ebe in main (argc=<optimized out>, argv=<optimized
out>, envp=<optimized out>) at ../softmmu/main.c:50





reply via email to

[Prev in Thread] Current Thread [Next in Thread]