[Qemu-devel] Fwd: [RFC PATCH 00/17] reverse debugging


From: Ciro Santilli
Subject: [Qemu-devel] Fwd: [RFC PATCH 00/17] reverse debugging
Date: Sat, 28 Apr 2018 09:14:25 +0100

Setting reverse debugging aside, I believe there is a deadlock in the replay
at commit 63d426dfa4fbfac3d50cda3f553cd975de2b85ea, but it is rare.

I have only reproduced it on ARM so far, and I have not checked whether it
also happens before this patch series.

The setup is 
https://github.com/cirosantilli/qemu-test/tree/6a3497f0d84e7c86ef80f7322e24e8a149b93214
with images-ab21ef58deed8536bc159c2afd680a4fabd68510.zip

Then try to run it several times with:

i=0; while true; do date; echo $i; ../qemu-test/arm/rr; i=$(($i+1)); done
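
A variant of that loop that stops by itself when a run hangs might look like
the sketch below. It is only an illustration: run_until_hang and HUNG_PID are
made-up names, the time limit is a guess, and it assumes that
../qemu-test/arm/rr exits on its own after a normal replay. The hung run is
left alive so that GDB can still attach to it.

```shell
# Hypothetical helper, not part of qemu-test.
# Usage: run_until_hang LIMIT_SECONDS MAX_RUNS CMD [ARGS...]
# Returns 0 if every run finished in time, 1 at the first run that did not.
run_until_hang() {
    limit=$1
    max=$2
    shift 2
    i=0
    while [ "$i" -lt "$max" ]; do
        date
        echo "run $i"
        "$@" &                      # start one replay in the background
        pid=$!
        waited=0
        # Poll once per second until the run exits or the limit is reached.
        while kill -0 "$pid" 2>/dev/null; do
            if [ "$waited" -ge "$limit" ]; then
                echo "run $i still alive after ${limit}s, leaving PID $pid running"
                HUNG_PID=$pid       # PID of the launched command, for inspection
                return 1
            fi
            sleep 1
            waited=$((waited + 1))
        done
        wait "$pid" 2>/dev/null     # collect the exit status of the finished run
        i=$((i + 1))
    done
    return 0
}
```

For example, "run_until_hang 600 1000 ../qemu-test/arm/rr" would give up after
the first run that lasts longer than ten minutes. HUNG_PID is the PID of the
wrapper script, so the QEMU process itself may still need to be looked up
with pgrep.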

I think the deadlock can happen in a few different places, but the
most common is when the kernel is doing disk-related work. The last
messages before it gets stuck are:

[   11.530325] ALSA device list:
[   11.531451]   No soundcards found.

and what would follow in a normal replay is:

[   11.551904] EXT4-fs (vda): couldn't mount as ext3 due to feature incompatibilities
[   11.619238] EXT4-fs (vda): mounted filesystem without journal. Opts: (null)

I then attach GDB with:

gdb -q ./arm-softmmu/qemu-system-arm `pgrep qemu`

and then:

>>> thread apply all bt

Thread 5 (Thread 0x7f59c6efb700 (LWP 22096)):
#0  0x00007f59e7aa9072 in futex_wait_cancelable (private=<optimized
out>, expected=0, futex_word=0x55a8e99801d8) at
../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  0x00007f59e7aa9072 in __pthread_cond_wait_common (abstime=0x0,
mutex=0x55a8e89cbf40 <qemu_global_mutex>, cond=0x55a8e99801b0) at
pthread_cond_wait.c:502
#2  0x00007f59e7aa9072 in __pthread_cond_wait (cond=0x55a8e99801b0,
mutex=0x55a8e89cbf40 <qemu_global_mutex>) at pthread_cond_wait.c:655
#3  0x000055a8e7f4f178 in qemu_cond_wait_impl (cond=0x55a8e99801b0,
mutex=0x55a8e89cbf40 <qemu_global_mutex>, file=0x55a8e80b10a8
"/home/ciro/git/qemu/cpus.c", line=1175) at
util/qemu-thread-posix.c:164
#4  0x000055a8e7999965 in qemu_tcg_rr_wait_io_event
(cpu=0x55a8e986b330) at /home/ciro/git/qemu/cpus.c:1175
#5  0x000055a8e799a1f5 in qemu_tcg_rr_cpu_thread_fn
(arg=0x55a8e986b330) at /home/ciro/git/qemu/cpus.c:1502
#6  0x00007f59e7aa27fc in start_thread (arg=0x7f59c6efb700) at
pthread_create.c:465
#7  0x00007f59e77cfb5f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 4 (Thread 0x7f59c76fc700 (LWP 22095)):
#0  0x00007f59e77c3a4b in __GI_ppoll (fds=0x7f59b8000b10, nfds=1,
timeout=<optimized out>, sigmask=0x0) at
../sysdeps/unix/sysv/linux/ppoll.c:39
#1  0x000055a8e7f4a02e in qemu_poll_ns (fds=0x7f59b8000b10, nfds=1,
timeout=-1) at util/qemu-timer.c:322
#2  0x000055a8e7f4cb5e in aio_poll (ctx=0x55a8e978eab0, blocking=true)
at util/aio-posix.c:629
#3  0x000055a8e7b5f084 in iothread_run (opaque=0x55a8e970c710) at
iothread.c:64
#4  0x00007f59e7aa27fc in start_thread (arg=0x7f59c76fc700) at
pthread_create.c:465
#5  0x00007f59e77cfb5f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 3 (Thread 0x7f59ced65700 (LWP 22093)):
#0  0x00007f59e77c9a49 in syscall () at
../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x00007f59e88456ef in g_cond_wait () at
/lib/x86_64-linux-gnu/libglib-2.0.so.0
#2  0x000055a8e7f43157 in wait_for_trace_records_available () at
trace/simple.c:150
#3  0x000055a8e7f431b8 in writeout_thread (opaque=0x0) at
trace/simple.c:169
#4  0x00007f59e8827645 in  () at
/lib/x86_64-linux-gnu/libglib-2.0.so.0
#5  0x00007f59e7aa27fc in start_thread (arg=0x7f59ced65700) at
pthread_create.c:465
#6  0x00007f59e77cfb5f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 2 (Thread 0x7f59cf566700 (LWP 22092)):
#0  0x00007f59e77c9a49 in syscall () at
../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x000055a8e7f4f5d8 in qemu_futex_wait (f=0x55a8e8e48418
<rcu_call_ready_event>, val=4294967295) at
/home/ciro/git/qemu/include/qemu/futex.h:29
#2  0x000055a8e7f4f79f in qemu_event_wait (ev=0x55a8e8e48418
<rcu_call_ready_event>) at util/qemu-thread-posix.c:445
#3  0x000055a8e7f67d2d in call_rcu_thread (opaque=0x0) at
util/rcu.c:261
#4  0x00007f59e7aa27fc in start_thread (arg=0x7f59cf566700) at
pthread_create.c:465
#5  0x00007f59e77cfb5f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7f59ecf03280 (LWP 22091)):
#0  0x00007f59e77c3a4b in __GI_ppoll (fds=0x55a8e9860aa0, nfds=5,
timeout=<optimized out>, sigmask=0x0) at
../sysdeps/unix/sysv/linux/ppoll.c:39
#1  0x000055a8e7f4a0c4 in qemu_poll_ns (fds=0x55a8e9860aa0, nfds=5,
timeout=1000000000) at util/qemu-timer.c:334
#2  0x000055a8e7f4b176 in os_host_main_loop_wait (timeout=1000000000)
at util/main-loop.c:258
#3  0x000055a8e7f4b241 in main_loop_wait (nonblocking=0) at
util/main-loop.c:522
#4  0x000055a8e7b66fed in main_loop () at vl.c:1943
#5  0x000055a8e7b6ead4 in main (argc=24, argv=0x7fff6fe0f328,
envp=0x7fff6fe0f3f0) at vl.c:4740
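
The same backtraces can be collected without an interactive session. A
minimal sketch, assuming a gdb that supports the -batch and -ex options;
dump_backtraces and the output file name are hypothetical:

```shell
# Hypothetical helper: attach gdb to a running process, dump the backtrace
# of every thread to a file, then detach and exit.
# Usage: dump_backtraces BINARY PID OUTFILE
dump_backtraces() {
    # -batch makes gdb exit after running the -ex command
    gdb -q -batch -ex 'thread apply all bt' "$1" "$2" > "$3" 2>&1
}
```

For example: dump_backtraces ./arm-softmmu/qemu-system-arm "$(pgrep qemu)" bt.txt
This only works as intended when pgrep matches a single QEMU process.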

On Wed, Apr 25, 2018 at 1:45 PM, Pavel Dovgalyuk
<address@hidden> wrote:
> GDB remote protocol supports reverse debugging of the targets.
> It includes 'reverse step' and 'reverse continue' operations.
> The first one finds the previous step of the execution,
> and the second one is intended to stop at the last breakpoint that
> would happen when the program is executed normally.
>
> Reverse debugging is possible in the replay mode, when at least
> one snapshot was created at the record or replay phase.
> QEMU can use these snapshots for travelling back in time with GDB.
>
> Running the execution in replay mode allows using GDB reverse debugging
> commands:
>  - reverse-stepi (or rsi): Steps one instruction to the past.
>    QEMU loads one of the prior snapshots and proceeds forward to the
>    desired instruction. When that step is reached, execution stops.
>  - reverse-continue (or rc): Runs execution "backwards".
>    QEMU tries to find a breakpoint or watchpoint by loading a prior
>    snapshot and replaying the execution. Then QEMU loads snapshots again
>    and replays to the latest breakpoint. When there are no breakpoints in
>    the examined section of the execution, QEMU finds one more snapshot
>    and tries again. After the first snapshot is processed, execution
>    stops at this snapshot.
>
> The set of patches includes the following modifications:
>  - gdbstub update for reverse debugging support
>  - functions that automatically perform reverse step and reverse
>    continue operations
>  - hmp/qmp commands for manipulating the replay process
>  - improvement of the snapshotting for saving the execution step
>    in the snapshot parameters
>  - other record/replay fixes
>
> The patches are available in the repository:
> https://github.com/ispras/qemu/tree/rr-180207
>
> ---
>
> Pavel Dovgalyuk (17):
>       block: implement bdrv_snapshot_goto for blkreplay
>       replay: disable default snapshot for record/replay
>       replay: update docs for record/replay with block devices
>       replay: don't drain/flush bdrv queue while RR is working
>       replay: finish record/replay before closing the disks
>       migration: introduce icount field for snapshots
>       qcow2: introduce icount field for snapshots
>       replay: introduce info hmp/qmp command
>       replay: introduce breakpoint at the specified step
>       replay: implement replay_seek command to proceed to the desired step
>       replay: flush events when exitting
>       timer: remove replay clock probe in deadline calculation
>       replay: refine replay-time module
>       translator: fix breakpoint processing
>       replay: flush rr queue before loading the vmstate
>       gdbstub: add reverse step support in replay mode
>       gdbstub: add reverse continue support in replay mode
>
>
>  accel/tcg/translator.c    |    8 +
>  block/blkreplay.c         |    8 +
>  block/io.c                |   22 +++
>  block/qapi.c              |   11 +-
>  block/qcow2-snapshot.c    |    9 +
>  block/qcow2.h             |    2
>  blockdev.c                |    3
>  cpus.c                    |   19 ++-
>  docs/replay.txt           |   12 +-
>  exec.c                    |    6 +
>  gdbstub.c                 |   50 +++++++-
>  hmp-commands-info.hx      |   14 ++
>  hmp-commands.hx           |   30 +++++
>  hmp.h                     |    3
>  include/block/snapshot.h  |    1
>  include/sysemu/replay.h   |   18 +++
>  migration/savevm.c        |   11 +-
>  qapi/block-core.json      |    5 +
>  qapi/block.json           |    3
>  qapi/misc.json            |   69 +++++++++++
>  replay/Makefile.objs      |    3
>  replay/replay-debugging.c |  286 +++++++++++++++++++++++++++++++++++++++++++++
>  replay/replay-events.c    |   14 --
>  replay/replay-internal.h  |   10 +-
>  replay/replay-time.c      |   27 ++--
>  replay/replay.c           |   22 +++
>  stubs/replay.c            |   10 ++
>  util/qemu-timer.c         |   11 --
>  vl.c                      |   11 +-
>  29 files changed, 625 insertions(+), 73 deletions(-)
>  create mode 100644 replay/replay-debugging.c
>
> --
> Pavel Dovgalyuk


