From: Greg Kurz
Subject: Re: [Qemu-block] qemu-iotests RC0+ status
Date: Fri, 13 Jul 2018 12:32:18 +0200
On Thu, 12 Jul 2018 18:16:05 -0400
John Snow <address@hidden> wrote:
> Hi, on a Fedora 28 x86_64 host, as of 68f1b569 I'm seeing:
>
> `./check -v -qcow`
> - occasional stall on 052
> - stalls on 216
>
> `./check -v -qed`
> - stalls on 200
>
> `./check -v -luks`
> - failures on 226.
>
>
> 052 is something I can't reproduce. The test takes quite a while, so
> maybe I'm simply not being patient enough.
>
>
> 216 appears to have never worked for qcow1.
>
>
> 226 is my fault: the test doesn't handle LUKS well, because the harness
> sets TEST_IMG to something sneaky:
> 'driver=luks,key-secret=keysec0,file.filename=/path/to/src/qemu/bin/git/tests/qemu-iotests/scratch/t.luks'
> ... my test expected this to be a real file path and doesn't cope with
> the driver string particularly well. I've sent a patch.
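> 
> To make the sneakiness concrete, here's a hypothetical C sketch (my
> names, not the actual patch) of recovering the underlying path from such
> a value, assuming file.filename is the last option as in the string above:
> 
>     #include <stdio.h>
>     #include <string.h>
> 
>     /* Return the file.filename value embedded in an option string like
>      * "driver=luks,key-secret=keysec0,file.filename=/path/t.luks", or
>      * the input unchanged if it is already a plain path. */
>     static const char *real_filename(const char *test_img)
>     {
>         const char *p = strstr(test_img, "file.filename=");
>         return p ? p + strlen("file.filename=") : test_img;
>     }
> 
>     int main(void)
>     {
>         printf("%s\n", real_filename(
>             "driver=luks,key-secret=keysec0,file.filename=scratch/t.luks"));
>         return 0;
>     }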
>
>
> 200 appears to regress on this commit:
> f45280cbf66d8e58224f6a253d0ae2aa72cc6280 is the first bad commit
> commit f45280cbf66d8e58224f6a253d0ae2aa72cc6280
> Author: Greg Kurz <address@hidden>
> Date: Mon May 28 14:03:59 2018 +0200
>
Hi,
After a few tries, I could reproduce the hang.
(gdb) thread apply all bt
Thread 5 (Thread 0x7fbea5be8700 (LWP 482289)):
#0 0x00007fbeb4646caf in do_sigwait () at /lib64/libpthread.so.0
#1 0x00007fbeb4646d3d in sigwait () at /lib64/libpthread.so.0
#2 0x0000560b3e4b05b5 in qemu_dummy_cpu_thread_fn (arg=0x560b3f925750) at
/var/tmp/qemu/cpus.c:1260
#3 0x0000560b3ea5561d in qemu_thread_start (args=0x560b3f9449b0) at
util/qemu-thread-posix.c:504
#4 0x00007fbeb463c50b in start_thread () at /lib64/libpthread.so.0
#5 0x00007fbeb437416f in clone () at /lib64/libc.so.6
Thread 4 (Thread 0x7fbea70ef700 (LWP 482286)):
#0 0x00007fbeb4645b1d in __lll_lock_wait () at /lib64/libpthread.so.0
#1 0x00007fbeb463ef18 in pthread_mutex_lock () at /lib64/libpthread.so.0
#2 0x0000560b3ea54aa6 in qemu_mutex_lock_impl (mutex=0x560b3f9032a0,
file=0x560b3ebd0c41 "util/async.c", line=511) at util/qemu-thread-posix.c:66
#3 0x0000560b3ea4dcf0 in aio_context_acquire (ctx=0x560b3f903240) at
util/async.c:511
#4 0x0000560b3ea4d88a in co_schedule_bh_cb (opaque=0x560b3f903240) at
util/async.c:399
#5 0x0000560b3ea4cfaa in aio_bh_call (bh=0x560b3f901a70) at util/async.c:90
#6 0x0000560b3ea4d042 in aio_bh_poll (ctx=0x560b3f903240) at util/async.c:118
#7 0x0000560b3ea525a6 in aio_poll (ctx=0x560b3f903240, blocking=true) at
util/aio-posix.c:689
#8 0x0000560b3e63d3b1 in iothread_run (opaque=0x560b3f8d3000) at iothread.c:64
#9 0x0000560b3ea5561d in qemu_thread_start (args=0x560b3f9035e0) at
util/qemu-thread-posix.c:504
#10 0x00007fbeb463c50b in start_thread () at /lib64/libpthread.so.0
#11 0x00007fbeb437416f in clone () at /lib64/libc.so.6
Thread 3 (Thread 0x7fbea78f0700 (LWP 482285)):
#0 0x00007fbeb4369d66 in ppoll () at /lib64/libc.so.6
#1 0x0000560b3ea4f59d in qemu_poll_ns (fds=0x7fbe98000b20, nfds=1, timeout=-1)
at util/qemu-timer.c:322
#2 0x0000560b3ea522d9 in aio_poll (ctx=0x560b3f82f320, blocking=true) at
util/aio-posix.c:629
#3 0x0000560b3e63d3b1 in iothread_run (opaque=0x560b3f82f050) at iothread.c:64
#4 0x0000560b3ea5561d in qemu_thread_start (args=0x560b3f8306a0) at
util/qemu-thread-posix.c:504
#5 0x00007fbeb463c50b in start_thread () at /lib64/libpthread.so.0
#6 0x00007fbeb437416f in clone () at /lib64/libc.so.6
Thread 2 (Thread 0x7fbea80f1700 (LWP 482284)):
#0 0x00007fbeb436eb99 in syscall () at /lib64/libc.so.6
#1 0x0000560b3ea552a9 in qemu_futex_wait (f=0x560b3f21c2f8
<rcu_call_ready_event>, val=4294967295) at /var/tmp/qemu/include/qemu/futex.h:29
#2 0x0000560b3ea55470 in qemu_event_wait (ev=0x560b3f21c2f8
<rcu_call_ready_event>) at util/qemu-thread-posix.c:442
#3 0x0000560b3ea6dcaa in call_rcu_thread (opaque=0x0) at util/rcu.c:261
#4 0x0000560b3ea5561d in qemu_thread_start (args=0x560b3f7cdaa0) at
util/qemu-thread-posix.c:504
#5 0x00007fbeb463c50b in start_thread () at /lib64/libpthread.so.0
#6 0x00007fbeb437416f in clone () at /lib64/libc.so.6
Thread 1 (Thread 0x7fbebca47d00 (LWP 482283)):
#0 0x00007fbeb4369d66 in ppoll () at /lib64/libc.so.6
#1 0x0000560b3ea4f59d in qemu_poll_ns (fds=0x560b3f914f50, nfds=1, timeout=-1)
at util/qemu-timer.c:322
#2 0x0000560b3ea522d9 in aio_poll (ctx=0x560b3f8f5f10, blocking=true) at
util/aio-posix.c:629
#3 0x0000560b3e9980fa in bdrv_do_drained_begin (bs=0x560b3f9086e0,
recursive=false, parent=0x0, ignore_bds_parents=false, poll=true) at
block/io.c:390
#4 0x0000560b3e998167 in bdrv_drained_begin (bs=0x560b3f9086e0) at
block/io.c:396
#5 0x0000560b3e984ac8 in blk_drain (blk=0x560b3fb6b140) at
block/block-backend.c:1591
#6 0x0000560b3e982fa3 in blk_remove_bs (blk=0x560b3fb6b140) at
block/block-backend.c:775
#7 0x0000560b3e9824f8 in blk_delete (blk=0x560b3fb6b140) at
block/block-backend.c:401
#8 0x0000560b3e98271d in blk_unref (blk=0x560b3fb6b140) at
block/block-backend.c:450
#9 0x0000560b3e929a8c in block_job_free (job=0x560b3f874c00) at blockjob.c:98
#10 0x0000560b3e92b86e in job_unref (job=0x560b3f874c00) at job.c:367
#11 0x0000560b3e92c1ed in job_do_dismiss (job=0x560b3f874c00) at job.c:633
#12 0x0000560b3e92c2f9 in job_conclude (job=0x560b3f874c00) at job.c:659
#13 0x0000560b3e92c586 in job_finalize_single (job=0x560b3f874c00) at job.c:727
#14 0x0000560b3e92c771 in job_completed_txn_abort (job=0x560b3f874c00) at
job.c:783
#15 0x0000560b3e92cb6e in job_completed (job=0x560b3f874c00, ret=0, error=0x0)
at job.c:882
#16 0x0000560b3e683260 in stream_complete (job=0x560b3f874c00,
opaque=0x560b3f9035e0) at block/stream.c:96
#17 0x0000560b3e92ce4b in job_defer_to_main_loop_bh (opaque=0x560b3f903600) at
job.c:973
#18 0x0000560b3ea4cfaa in aio_bh_call (bh=0x7fbe9c001230) at util/async.c:90
#19 0x0000560b3ea4d042 in aio_bh_poll (ctx=0x560b3f8f5f10) at util/async.c:118
#20 0x0000560b3ea51bca in aio_dispatch (ctx=0x560b3f8f5f10) at
util/aio-posix.c:436
#21 0x0000560b3ea4d3dd in aio_ctx_dispatch (source=0x560b3f8f5f10,
callback=0x0, user_data=0x0) at util/async.c:261
#22 0x00007fbebc183b77 in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#23 0x0000560b3ea50623 in glib_pollfds_poll () at util/main-loop.c:215
#24 0x0000560b3ea50691 in os_host_main_loop_wait (timeout=0) at
util/main-loop.c:238
#25 0x0000560b3ea5074a in main_loop_wait (nonblocking=0) at util/main-loop.c:497
#26 0x0000560b3e6462fc in main_loop () at vl.c:1866
#27 0x0000560b3e64dbf3 in main (argc=21, argv=0x7ffc4f767948,
envp=0x7ffc4f7679f8) at vl.c:4644
Thread 4 is iothread0.
#3 0x0000560b3ea4dcf0 in aio_context_acquire (ctx=0x560b3f903240) at
util/async.c:511
511 qemu_rec_mutex_lock(&ctx->lock);
(gdb) p ctx
$34 = (AioContext *) 0x560b3f903240
(gdb) p ctx->lock.lock.__data.__owner
$35 = 482283
482283 is the pid of the main thread (Thread 1 above), which seems to have
acquired this AioContext in job_defer_to_main_loop_bh().
#17 0x0000560b3e92ce4b in job_defer_to_main_loop_bh (opaque=0x560b3f903600) at
job.c:973
973 data->fn(data->job, data->opaque);
(gdb) p data->job->aio_context
$36 = (AioContext *) 0x560b3f903240
It looks like we have a deadlock here... Not sure how to debug that :-\
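To make the suspected cycle concrete, here is a toy C reproduction of the
pattern I think we are seeing (a plain pthread mutex standing in for the
AioContext lock; names and structure are mine, not QEMU's):

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t ctx_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Stand-in for iothread0: like co_schedule_bh_cb(), it cannot make
     * progress without taking the context lock first. */
    static void *iothread_fn(void *arg)
    {
        pthread_mutex_lock(&ctx_lock);   /* blocks: main thread owns it */
        puts("iothread made progress");  /* never printed */
        pthread_mutex_unlock(&ctx_lock);
        return arg;
    }

    int main(void)
    {
        pthread_t tid;

        /* Like job_defer_to_main_loop_bh() -> aio_context_acquire():
         * the main thread takes the iothread's context lock... */
        pthread_mutex_lock(&ctx_lock);
        pthread_create(&tid, NULL, iothread_fn, NULL);

        /* ...and then, like the blocking aio_poll() in
         * bdrv_drained_begin(), waits for progress that only the other
         * thread can make. */
        pthread_join(tid, NULL);         /* neither side ever advances */

        pthread_mutex_unlock(&ctx_lock); /* never reached */
        return 0;
    }

Compile with 'gcc -pthread' and it hangs in exactly this shape: the main
thread blocked while holding the lock the worker needs.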
Cheers,
--
Greg
>
>
> Thanks,
> --js