qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/4] block/nbd: fix drain dead-lock because of nbd reconnect-


From: Eric Blake
Subject: Re: [PATCH 1/4] block/nbd: fix drain dead-lock because of nbd reconnect-delay
Date: Wed, 23 Sep 2020 10:08:57 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.11.0

On 9/3/20 2:02 PM, Vladimir Sementsov-Ogievskiy wrote:
We pause reconnect process during drained section. So, if we have some
requests, waiting for reconnect we should cancel them, otherwise they
deadlock the drained section.

How to reproduce:


Now Qemu is trying to reconnect, and dd-generated requests are waiting
for the connection (they will wait up to 60 seconds (see
reconnect-delay option above) and than fail). But suddenly, vm may
totally hang in the deadlock. You may need to increase reconnect-delay
period to catch the dead-lock.

VM doesn't respond because drain dead-lock happens in cpu thread with
global mutex taken. That's not good thing by itself and is not fixed
by this commit (true way is using iothreads). Still this commit fixes
drain dead-lock itself.

Note: probably, we can instead continue to reconnect during drained
section. To achieve this, we may move negotiation to the connect thread
to make it independent of bs aio context. But expanding drained section
doesn't seem good anyway. So, let's now fix the bug the simplest way.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
  block/nbd.c | 5 +++++
  1 file changed, 5 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c
index 9daf003bea..912ea27be7 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -242,6 +242,11 @@ static void coroutine_fn 
nbd_client_co_drain_begin(BlockDriverState *bs)
      }
nbd_co_establish_connection_cancel(bs, false);
+
+    if (s->state == NBD_CLIENT_CONNECTING_WAIT) {
+        s->state = NBD_CLIENT_CONNECTING_NOWAIT;
+        qemu_co_queue_restart_all(&s->free_sema);
+    }
  }

Reviewed-by: Eric Blake <eblake@redhat.com>

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




reply via email to

[Prev in Thread] Current Thread [Next in Thread]