Re: [Qemu-devel] [PATCH for-2.4 0/2] AioContext: fix deadlock after aio_
From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH for-2.4 0/2] AioContext: fix deadlock after aio_context_acquire() race
Date: Tue, 28 Jul 2015 11:31:49 +0100
On Tue, Jul 28, 2015 at 11:26 AM, Cornelia Huck
<address@hidden> wrote:
> On Tue, 28 Jul 2015 09:34:46 +0100
> Stefan Hajnoczi <address@hidden> wrote:
>
>> On Tue, Jul 28, 2015 at 10:02:26AM +0200, Cornelia Huck wrote:
>> > On Tue, 28 Jul 2015 09:07:00 +0200
>> > Cornelia Huck <address@hidden> wrote:
>> >
>> > > On Mon, 27 Jul 2015 17:33:37 +0100
>> > > Stefan Hajnoczi <address@hidden> wrote:
>> > >
>> > > > See Patch 2 for details on the deadlock after two aio_context_acquire()
>> > > > calls race. This caused dataplane to hang on startup.
>> > > >
>> > > > Patch 1 is a memory leak fix for AioContext that's needed by Patch 2.
>> > > >
>> > > > Stefan Hajnoczi (2):
>> > > >   AioContext: avoid leaking BHs on cleanup
>> > > >   AioContext: force event loop iteration using BH
>> > > >
>> > > >  async.c             | 29 +++++++++++++++++++++++++++--
>> > > >  include/block/aio.h |  3 +++
>> > > >  2 files changed, 30 insertions(+), 2 deletions(-)
>> > > >
>> > >
>> > > Just gave this a try: The stripped-down guest that hangs during startup
>> > > on master is working fine with these patches applied, and my full setup
>> > > works as well.
>> > >
>> > > So,
>> > >
>> > > Tested-by: Cornelia Huck <address@hidden>
>> >
>> > Uh-oh, spoke too soon. It starts, but when I try a virsh managedsave, I
>> > get
>> >
>> > qemu-system-s390x: /data/git/yyy/qemu/async.c:242: aio_ctx_finalize:
>> > Assertion `ctx->first_bh->deleted' failed.
>>
>> Please pretty-print ctx->first_bh in gdb. In particular, which function
>> is ctx->first_bh->cb pointing to?
>
> (gdb) p/x *(QEMUBH *)ctx->first_bh
> $2 = {ctx = 0x9aab3730, cb = 0x801b7c5c, opaque = 0x3ff9800dee0, next =
> 0x3ff9800dfb0, scheduled = 0x0, idle = 0x0, deleted = 0x0}
>
> cb is pointing at spawn_thread_bh_fn.
>
>>
>> I tried reproducing with qemu-system-x86_64 and a RHEL 7 guest but
>> couldn't trigger the assertion failure.
>
> I use the old x-data-plane attribute; if I turn it off, I don't hit the
> assertion.
Thanks. I understand how to reproduce it now: use -drive aio=threads
and do I/O during managedsave.

I suspect there are more cases of this. We need to clean it up during QEMU 2.5.

For now let's continue leaking these BHs as we've always done.

Stefan
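
For readers following the assertion failure, here is a minimal sketch in C of why a bottom half (BH) that is still live at finalize time trips a check like `ctx->first_bh->deleted`. It assumes a heavily simplified model of the BH list: the `Sketch*` types and `sketch_*` / `simulate_shutdown` functions are illustrative stand-ins, not QEMU's actual definitions, and only the field names are taken from the gdb dump quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-in for a bottom half (BH). Field names mirror the gdb
 * dump in this thread; this is not QEMU's real definition. */
typedef struct SketchBH {
    void (*cb)(void *opaque);
    void *opaque;
    struct SketchBH *next;
    bool scheduled;
    bool deleted;
} SketchBH;

/* Simplified stand-in for an AioContext: just the head of the BH list. */
typedef struct SketchCtx {
    SketchBH *first_bh;
} SketchCtx;

/* Placeholder callback; the sketch never invokes it. */
static void noop_cb(void *opaque)
{
    (void)opaque;
}

/* Allocate a BH and push it on the front of the context's list. */
static SketchBH *sketch_bh_new(SketchCtx *ctx, void (*cb)(void *), void *opaque)
{
    SketchBH *bh = calloc(1, sizeof(*bh));
    if (bh == NULL) {
        abort();
    }
    bh->cb = cb;
    bh->opaque = opaque;
    bh->next = ctx->first_bh;
    ctx->first_bh = bh;
    return bh;
}

/* Mark a BH as deleted; its memory is reclaimed later by the list owner. */
static void sketch_bh_delete(SketchBH *bh)
{
    bh->scheduled = false;
    bh->deleted = true;
}

/* Free every BH and return how many were still live (not deleted).
 * A strict finalizer would assert(bh->deleted) here instead of counting. */
static int sketch_ctx_finalize(SketchCtx *ctx)
{
    int live = 0;
    while (ctx->first_bh != NULL) {
        SketchBH *next = ctx->first_bh->next;
        if (!ctx->first_bh->deleted) {
            live++;
        }
        free(ctx->first_bh);
        ctx->first_bh = next;
    }
    return live;
}

/* Create two BHs, optionally leave one undeleted, then finalize. */
int simulate_shutdown(bool owner_deletes_all)
{
    SketchCtx ctx = { .first_bh = NULL };
    SketchBH *a = sketch_bh_new(&ctx, noop_cb, NULL);
    SketchBH *b = sketch_bh_new(&ctx, noop_cb, NULL);

    sketch_bh_delete(a);
    if (owner_deletes_all) {
        sketch_bh_delete(b);
    }
    return sketch_ctx_finalize(&ctx);
}
```

In this model, `simulate_shutdown(false)` reports one live BH, which is the situation where a strict finalize-time assertion would abort, while `simulate_shutdown(true)` reports none. Counting instead of asserting is a choice made for the sketch only, so that both cases can be observed without crashing.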