qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: migration-test hang, s390x host, aarch64 guest


From: Peter Xu
Subject: Re: migration-test hang, s390x host, aarch64 guest
Date: Tue, 3 Aug 2021 16:53:11 -0400

On Mon, Aug 02, 2021 at 09:55:55AM +0100, Dr. David Alan Gilbert wrote:
> * Peter Maydell (peter.maydell@linaro.org) wrote:
> > migration-test hung again during 'make check'.
> 
> ccing in Peter Xu
> 
> > Process tree:
> > 
> > ubuntu   42067  0.0  0.0   5460  3156 ?        S    13:55   0:00
> >            \_ bash -o pipefail -c echo
> > 'MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
> > QTEST_QEMU_IMG=./qemu-img
> > G_TEST_DBUS_DAEMON=/home/ubuntu/qemu/tests/dbus-vmstate-daemon.sh
> > QTEST_QEMU_BINARY=./qemu-system-aarch64
> > QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon
> > tests/qtest/migration-test --tap -k' &&
> > MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
> > QTEST_QEMU_IMG=./qemu-img
> > G_TEST_DBUS_DAEMON=/home/ubuntu/qemu/tests/dbus-vmstate-daemon.sh
> > QTEST_QEMU_BINARY=./qemu-system-aarch64
> > QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon
> > tests/qtest/migration-test --tap -k -m quick < /dev/null |
> > ./scripts/tap-driver.pl --test-name="qtest-aarch64/migration-test"
> > ubuntu   42070  0.0  0.0  78844  3184 ?        Sl   13:55   0:01
> >                \_ tests/qtest/migration-test --tap -k -m quick
> > ubuntu   43248  0.0  4.1 1401368 160852 ?      Sl   13:55   0:03
> >                |   \_ ./qemu-system-aarch64 -qtest
> > unix:/tmp/qtest-42070.sock -qtest-log /dev/null -chardev
> > socket,path=/tmp/qtest-42070.qmp,id=char0 -mon
> > chardev=char0,mode=control -display none -accel kvm -accel tcg
> > -machine virt,gic-version=max -name source,debug-threads=on -m 150M
> > -serial file:/tmp/migration-test-SL7EkN/src_serial -cpu max -kernel
> > /tmp/migration-test-SL7EkN/bootsect -accel qtest
> > ubuntu   43256  0.0  1.4 1394208 57648 ?       Sl   13:55   0:00
> >                |   \_ ./qemu-system-aarch64 -qtest
> > unix:/tmp/qtest-42070.sock -qtest-log /dev/null -chardev
> > socket,path=/tmp/qtest-42070.qmp,id=char0 -mon
> > chardev=char0,mode=control -display none -accel kvm -accel tcg
> > -machine virt,gic-version=max -name target,debug-threads=on -m 150M
> > -serial file:/tmp/migration-test-SL7EkN/dest_serial -incoming
> > unix:/tmp/migration-test-SL7EkN/migsocket -cpu max -kernel
> > /tmp/migration-test-SL7EkN/bootsect -accel qtest
> > ubuntu   42071  0.0  0.2  14116 11372 ?        S    13:55   0:00
> >                \_ perl ./scripts/tap-driver.pl
> > --test-name=qtest-aarch64/migration-test
> > 
> > Backtrace, migration-test process:
> > 
> > Thread 2 (Thread 0x3ff8ff7f910 (LWP 42075)):
> > #0  syscall () at ../sysdeps/unix/sysv/linux/s390/s390-64/syscall.S:53
> > #1  0x000002aa1d0d1c1c in qemu_futex_wait (val=<optimized out>,
> > f=<optimized out>) at /home/ubuntu/qemu/include/qemu/futex.h:29
> > #2  qemu_event_wait (ev=ev@entry=0x2aa1d0fda58 <rcu_call_ready_event>)
> > at ../../util/qemu-thread-posix.c:480
> > #3  0x000002aa1d0d0884 in call_rcu_thread (opaque=opaque@entry=0x0) at
> > ../../util/rcu.c:258
> > #4  0x000002aa1d0d0c32 in qemu_thread_start (args=<optimized out>) at
> > ../../util/qemu-thread-posix.c:541
> > #5  0x000003ff90207aa8 in start_thread (arg=0x3ff8ff7f910) at
> > pthread_create.c:463
> > #6  0x000003ff900fa11e in thread_start () at
> > ../sysdeps/unix/sysv/linux/s390/s390-64/clone.S:65
> > 
> > Thread 1 (Thread 0x3ff905f7c00 (LWP 42070)):
> > #0  0x000003ff90212978 in __waitpid (pid=<optimized out>,
> > stat_loc=stat_loc@entry=0x2aa1d5e06bc, options=options@entry=0)
> >     at ../sysdeps/unix/sysv/linux/waitpid.c:30
> > #1  0x000002aa1d0a2176 in qtest_kill_qemu (s=0x2aa1d5e06b0) at
> > ../../tests/qtest/libqtest.c:144
> > #2  0x000003ff903c0de6 in g_hook_list_invoke () from
> > /usr/lib/s390x-linux-gnu/libglib-2.0.so.0
> > #3  <signal handler called>
> > #4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> > #5  0x000003ff9003f3ca in __GI_abort () at abort.c:79
> > #6  0x000003ff903fcbb2 in g_assertion_message () from
> > /usr/lib/s390x-linux-gnu/libglib-2.0.so.0
> > #7  0x000003ff903fd2b4 in g_assertion_message_cmpstr () from
> > /usr/lib/s390x-linux-gnu/libglib-2.0.so.0
> > #8  0x000002aa1d0a10f6 in check_migration_status
> > (ungoals=0x3ffffdfe198, goal=0x2aa1d0d75c0 "postcopy-paused",
> > who=0x2aa1d5dee90)
> 
>             g_assert_cmpstr(current_status, !=,  *ungoal);
> 
> >     at ../../tests/qtest/migration-helpers.c:146
> > #9  wait_for_migration_status (who=0x2aa1d5dee90, goal=<optimized
> > out>, ungoals=0x3ffffdfe198)
> >     at ../../tests/qtest/migration-helpers.c:156
> > #10 0x000002aa1d09f610 in test_postcopy_recovery () at
> > ../../tests/qtest/migration-test.c:773
> 
>     wait_for_migration_status(from, "postcopy-paused",
>                               (const char * []) { "failed", "active",
>                                                   "completed", NULL });
> 
> so I think that means it hit one of failed/active/completed rather than
> the postcopy-paused it expected (shame we can't tell which)
> 
> > #11 0x000003ff903fc70c in ?? () from 
> > /usr/lib/s390x-linux-gnu/libglib-2.0.so.0
> > #12 0x000003ff903fc648 in ?? () from 
> > /usr/lib/s390x-linux-gnu/libglib-2.0.so.0
> > #13 0x000003ff903fc648 in ?? () from 
> > /usr/lib/s390x-linux-gnu/libglib-2.0.so.0
> > #14 0x000003ff903fc648 in ?? () from 
> > /usr/lib/s390x-linux-gnu/libglib-2.0.so.0
> > #15 0x000003ff903fc8ee in g_test_run_suite () from
> > /usr/lib/s390x-linux-gnu/libglib-2.0.so.0
> > #16 0x000003ff903fc928 in g_test_run () from
> > /usr/lib/s390x-linux-gnu/libglib-2.0.so.0
> > #17 0x000002aa1d09d8f2 in main (argc=<optimized out>, argv=<optimized
> > out>) at ../../tests/qtest/migration-test.c:1495
> > 
> > 
> > Backtrace, QEMU process 43248:
> > 
> > Thread 5 (Thread 0x3ff8d7fe910 (LWP 43361)):
> > #0  0x000003ff9d110582 in futex_abstimed_wait_cancelable (private=0,
> > abstime=0x0, expected=0, futex_word=0x2aa1e27dfac)
> >     at ../sysdeps/unix/sysv/linux/futex-internal.h:205
> > #1  0x000003ff9d110582 in do_futex_wait (sem=sem@entry=0x2aa1e27dfa8,
> > abstime=0x0) at sem_waitcommon.c:111
> > #2  0x000003ff9d11068c in __new_sem_wait_slow
> > (sem=sem@entry=0x2aa1e27dfa8, abstime=0x0) at sem_waitcommon.c:181
> > #3  0x000003ff9d110734 in __new_sem_wait (sem=sem@entry=0x2aa1e27dfa8)
> > at sem_wait.c:42
> > #4  0x000002aa1b6dfb96 in qemu_sem_wait (sem=0x2aa1e27dfa8,
> > sem@entry=<error reading variable: value has been optimized out>)
> >     at ../../util/qemu-thread-posix.c:357
> > #5  0x000002aa1aefe5b0 in postcopy_pause_return_path_thread
> > (s=0x2aa1e27dc00) at ../../migration/migration.c:2654
> > #6  0x000002aa1aefe5b0 in source_return_path_thread
> > (opaque=opaque@entry=0x2aa1e27dc00) at
> > ../../migration/migration.c:2854
> > #7  0x000002aa1b6ded72 in qemu_thread_start (args=<optimized out>) at
> > ../../util/qemu-thread-posix.c:541
> > #8  0x000003ff9d107aa8 in start_thread (arg=0x3ff8d7fe910) at
> > pthread_create.c:463
> > #9  0x000003ff9cffa11e in thread_start () at
> > ../sysdeps/unix/sysv/linux/s390/s390-64/clone.S:65
> 
> OK, source waiting for recovery - my reading is we don't send the
> recovery until the next command in the test.
> 
> > 
> > Thread 4 (Thread 0x3ff8ed42910 (LWP 43252)):
> > #0  0x000003ff9d10d9fe in futex_wait_cancelable (private=<optimized
> > out>, expected=0, futex_word=0x2aa1e49ca68)
> >     at ../sysdeps/unix/sysv/linux/futex-internal.h:88
> > #1  0x000003ff9d10d9fe in __pthread_cond_wait_common (abstime=0x0,
> > mutex=0x2aa1c1a4840 <qemu_global_mutex>, cond=0x2aa1e49ca40)
> >     at pthread_cond_wait.c:502
> > #2  0x000003ff9d10d9fe in __pthread_cond_wait
> > (cond=cond@entry=0x2aa1e49ca40, mutex=mutex@entry=0x2aa1c1a4840
> > <qemu_global_mutex>)
> >     at pthread_cond_wait.c:655
> > #3  0x000002aa1b6df59e in qemu_cond_wait_impl (cond=0x2aa1e49ca40,
> > mutex=0x2aa1c1a4840 <qemu_global_mutex>, file=0x2aa1b8a93c6
> > "../../softmmu/cpus.c", line=<optimized out>) at
> > ../../util/qemu-thread-posix.c:194
> > #4  0x000002aa1b40764e in qemu_wait_io_event
> > (cpu=cpu@entry=0x2aa1e43aa50) at ../../softmmu/cpus.c:419
> > #5  0x000002aa1b406916 in mttcg_cpu_thread_fn
> > (arg=arg@entry=0x2aa1e43aa50) at
> > ../../accel/tcg/tcg-accel-ops-mttcg.c:98
> > #6  0x000002aa1b6ded72 in qemu_thread_start (args=<optimized out>) at
> > ../../util/qemu-thread-posix.c:541
> > #7  0x000003ff9d107aa8 in start_thread (arg=0x3ff8ed42910) at
> > pthread_create.c:463
> > #8  0x000003ff9cffa11e in thread_start () at
> > ../sysdeps/unix/sysv/linux/s390/s390-64/clone.S:65
> > 
> > Thread 3 (Thread 0x3ff8f735910 (LWP 43251)):
> > #0  0x000003ff9cfef1b0 in __GI___poll (fds=0x3ff64003620, nfds=3,
> > timeout=<optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:29
> > #1  0x000003ff9e852624 in  () at /usr/lib/s390x-linux-gnu/libglib-2.0.so.0
> > #2  0x000003ff9e852aa8 in g_main_loop_run () at
> > /usr/lib/s390x-linux-gnu/libglib-2.0.so.0
> > #3  0x000002aa1b4eaf0e in iothread_run
> > (opaque=opaque@entry=0x2aa1df83b80) at ../../iothread.c:73
> > #4  0x000002aa1b6ded72 in qemu_thread_start (args=<optimized out>) at
> > ../../util/qemu-thread-posix.c:541
> > #5  0x000003ff9d107aa8 in start_thread (arg=0x3ff8f735910) at
> > pthread_create.c:463
> > #6  0x000003ff9cffa11e in thread_start () at
> > ../sysdeps/unix/sysv/linux/s390/s390-64/clone.S:65
> > 
> > Thread 2 (Thread 0x3ff90037910 (LWP 43250)):
> > #0  0x000003ff9cff52ea in syscall () at
> > ../sysdeps/unix/sysv/linux/s390/s390-64/syscall.S:53
> > #1  0x000002aa1b6dfd5c in qemu_futex_wait (val=<optimized out>,
> > f=<optimized out>) at /home/ubuntu/qemu/include/qemu/futex.h:29
> > #2  0x000002aa1b6dfd5c in qemu_event_wait (ev=ev@entry=0x2aa1c1c3888
> > <rcu_call_ready_event>) at ../../util/qemu-thread-posix.c:480
> > #3  0x000002aa1b6d917c in call_rcu_thread (opaque=opaque@entry=0x0) at
> > ../../util/rcu.c:258
> > #4  0x000002aa1b6ded72 in qemu_thread_start (args=<optimized out>) at
> > ../../util/qemu-thread-posix.c:541
> > #5  0x000003ff9d107aa8 in start_thread (arg=0x3ff90037910) at
> > pthread_create.c:463
> > #6  0x000003ff9cffa11e in thread_start () at
> > ../sysdeps/unix/sysv/linux/s390/s390-64/clone.S:65
> > 
> > Thread 1 (Thread 0x3ff9feb5470 (LWP 43248)):
> > #0  0x000003ff9cfef302 in __GI_ppoll (fds=0x2aa1ed28d00, nfds=6,
> > timeout=<optimized out>,
> >     timeout@entry=0x3fff517ecf8, sigmask=sigmask@entry=0x0) at
> > ../sysdeps/unix/sysv/linux/ppoll.c:39
> > #1  0x000002aa1b701fec in ppoll (__ss=0x0, __timeout=0x3fff517ecf8,
> > __nfds=<optimized out>, __fds=<optimized out>)
> >     at /usr/include/s390x-linux-gnu/bits/poll2.h:77
> > #2  0x000002aa1b701fec in qemu_poll_ns (fds=<optimized out>,
> > nfds=<optimized out>, timeout=timeout@entry=1000000000)
> >     at ../../util/qemu-timer.c:348
> > #3  0x000002aa1b6d2f98 in os_host_main_loop_wait (timeout=1000000000)
> > at ../../util/main-loop.c:250
> > #4  0x000002aa1b6d2f98 in main_loop_wait
> > (nonblocking=nonblocking@entry=0) at ../../util/main-loop.c:531
> > #5  0x000002aa1b496d44 in qemu_main_loop () at ../../softmmu/runstate.c:726
> > #6  0x000002aa1ae6d3f0 in main (argc=<optimized out>, argv=<optimized
> > out>, envp=<optimized out>) at ../../softmmu/main.c:50
> > 
> > Backtrace, QEMU process 43256:
> > 
> > Thread 6 (Thread 0x3ff726ff910 (LWP 43440)):
> > #0  0x000003ffb9f10582 in futex_abstimed_wait_cancelable (private=0,
> > abstime=0x0, expected=0, futex_word=0x2aa1a75d2d4)
> >     at ../sysdeps/unix/sysv/linux/futex-internal.h:205
> > #1  0x000003ffb9f10582 in do_futex_wait (sem=sem@entry=0x2aa1a75d2d0,
> > abstime=0x0) at sem_waitcommon.c:111
> > #2  0x000003ffb9f1068c in __new_sem_wait_slow
> > (sem=sem@entry=0x2aa1a75d2d0, abstime=0x0) at sem_waitcommon.c:181
> > #3  0x000003ffb9f10734 in __new_sem_wait (sem=sem@entry=0x2aa1a75d2d0)
> > at sem_wait.c:42
> > #4  0x000002aa17ddfb96 in qemu_sem_wait (sem=sem@entry=0x2aa1a75d2d0)
> > at ../../util/qemu-thread-posix.c:357
> > #5  0x000002aa17599f06 in postcopy_pause_incoming (mis=<optimized
> > out>) at ../../migration/savevm.c:2600
> 
> So the dest is still waiting to be woken up from pause? Hmm I thought
> the test had already given it a kick.
> 
> > #6  0x000002aa17599f06 in qemu_loadvm_state_main
> > (f=f@entry=0x2aa1a751610, mis=mis@entry=0x2aa1a75d130)
> >     at ../../migration/savevm.c:2674
> > #7  0x000002aa1759af4a in postcopy_ram_listen_thread
> > (opaque=opaque@entry=0x0) at ../../migration/savevm.c:1872
> > #8  0x000002aa17dded72 in qemu_thread_start (args=<optimized out>) at
> > ../../util/qemu-thread-posix.c:541
> > #9  0x000003ffb9f07aa8 in start_thread (arg=0x3ff726ff910) at
> > pthread_create.c:463
> > #10 0x000003ffb9dfa11e in thread_start () at
> > ../sysdeps/unix/sysv/linux/s390/s390-64/clone.S:65
> > 
> > Thread 5 (Thread 0x3ff88ffe910 (LWP 43439)):
> > #0  0x000003ffb9f10582 in futex_abstimed_wait_cancelable (private=0,
> > abstime=0x0, expected=0, futex_word=0x2aa1a75d2fc)
> >     at ../sysdeps/unix/sysv/linux/futex-internal.h:205
> > #1  0x000003ffb9f10582 in do_futex_wait (sem=sem@entry=0x2aa1a75d2f8,
> > abstime=0x0) at sem_waitcommon.c:111
> > #2  0x000003ffb9f1068c in __new_sem_wait_slow
> > (sem=sem@entry=0x2aa1a75d2f8, abstime=0x0) at sem_waitcommon.c:181
> > #3  0x000003ffb9f10734 in __new_sem_wait (sem=sem@entry=0x2aa1a75d2f8)
> > at sem_wait.c:42
> > #4  0x000002aa17ddfb96 in qemu_sem_wait (sem=0x2aa1a75d2f8,
> > sem@entry=<error reading variable: value has been optimized out>)
> >     at ../../util/qemu-thread-posix.c:357
> > #5  0x000002aa177ae6aa in postcopy_pause_fault_thread (mis=<optimized
> > out>) at ../../migration/postcopy-ram.c:847
> > #6  0x000002aa177ae6aa in postcopy_ram_fault_thread
> > (opaque=opaque@entry=0x2aa1a75d130) at
> > ../../migration/postcopy-ram.c:911
> > #7  0x000002aa17dded72 in qemu_thread_start (args=<optimized out>) at
> > ../../util/qemu-thread-posix.c:541
> > #8  0x000003ffb9f07aa8 in start_thread (arg=0x3ff88ffe910) at
> > pthread_create.c:463
> > #9  0x000003ffb9dfa11e in thread_start () at
> > ../sysdeps/unix/sysv/linux/s390/s390-64/clone.S:65
> > 
> > Thread 4 (Thread 0x3ff89b18910 (LWP 43271)):
> > Python Exception <class 'gdb.error'> PC not saved:
> > 
> > Thread 3 (Thread 0x3ff8a319910 (LWP 43270)):
> > #0  0x000003ffb9def1b0 in __GI___poll (fds=0x3ff84003620, nfds=3,
> > timeout=<optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:29
> > #1  0x000003ffbb652624 in  () at /usr/lib/s390x-linux-gnu/libglib-2.0.so.0
> > #2  0x000003ffbb652aa8 in g_main_loop_run () at
> > /usr/lib/s390x-linux-gnu/libglib-2.0.so.0
> > #3  0x000002aa17beaf0e in iothread_run
> > (opaque=opaque@entry=0x2aa1a3aeb80) at ../../iothread.c:73
> > #4  0x000002aa17dded72 in qemu_thread_start (args=<optimized out>) at
> > ../../util/qemu-thread-posix.c:541
> > #5  0x000003ffb9f07aa8 in start_thread (arg=0x3ff8a319910) at
> > pthread_create.c:463
> > #6  0x000003ffb9dfa11e in thread_start () at
> > ../sysdeps/unix/sysv/linux/s390/s390-64/clone.S:65
> > 
> > Thread 2 (Thread 0x3fface37910 (LWP 43267)):
> > #0  0x000003ffb9df52ea in syscall () at
> > ../sysdeps/unix/sysv/linux/s390/s390-64/syscall.S:53
> > #1  0x000002aa17ddfd5c in qemu_futex_wait (val=<optimized out>,
> > f=<optimized out>) at /home/ubuntu/qemu/include/qemu/futex.h:29
> > #2  0x000002aa17ddfd5c in qemu_event_wait (ev=ev@entry=0x2aa188c3888
> > <rcu_call_ready_event>) at ../../util/qemu-thread-posix.c:480
> > #3  0x000002aa17dd917c in call_rcu_thread (opaque=opaque@entry=0x0) at
> > ../../util/rcu.c:258
> > #4  0x000002aa17dded72 in qemu_thread_start (args=<optimized out>) at
> > ../../util/qemu-thread-posix.c:541
> > #5  0x000003ffb9f07aa8 in start_thread (arg=0x3fface37910) at
> > pthread_create.c:463
> > #6  0x000003ffb9dfa11e in thread_start () at
> > ../sysdeps/unix/sysv/linux/s390/s390-64/clone.S:65
> > 
> > Thread 1 (Thread 0x3ffbccb5470 (LWP 43256)):
> > #0  0x000003ffb9f0d9fe in futex_wait_cancelable (private=<optimized
> > out>, expected=0, futex_word=0x2aa188a47e8 <qemu_pause_cond+40>)
> >     at ../sysdeps/unix/sysv/linux/futex-internal.h:88
> > #1  0x000003ffb9f0d9fe in __pthread_cond_wait_common (abstime=0x0,
> > mutex=0x2aa188a4840 <qemu_global_mutex>, cond=0x2aa188a47c0
> > <qemu_pause_cond>) at pthread_cond_wait.c:502
> > #2  0x000003ffb9f0d9fe in __pthread_cond_wait
> > (cond=cond@entry=0x2aa188a47c0 <qemu_pause_cond>,
> > mutex=mutex@entry=0x2aa188a4840 <qemu_global_mutex>) at
> > pthread_cond_wait.c:655
> > #3  0x000002aa17ddf59e in qemu_cond_wait_impl (cond=0x2aa188a47c0
> > <qemu_pause_cond>, mutex=0x2aa188a4840 <qemu_global_mutex>,
> > file=0x2aa17fa93c6 "../../softmmu/cpus.c", line=<optimized out>) at
> > ../../util/qemu-thread-posix.c:194
> > #4  0x000002aa17b079e2 in pause_all_vcpus () at ../../softmmu/cpus.c:562
> > #5  0x000002aa17b07aec in do_vm_stop
> > (state=state@entry=RUN_STATE_SHUTDOWN,
> > send_stop=send_stop@entry=false)
> >     at ../../softmmu/cpus.c:261
> 
> Hang on; how is the dest (?) in shutdown?

Yeh, the other "mistery" is we don't have migration thread running on src at
all...  So I'm thinking maybe the migration failed somehow, and migration
thread quitted already.

Normally for recovery test we'll be stuck in postcopy_pause() on the src, if
postcopy_do_resume() failed we'll stick with the postcopy_pause_sem and wait
forever there; that won't eat the migration thread.

So it's gone only if we went the other way to MIG_THR_ERR_FATAL, I think.  That
can possibly be caused by not reaching MIGRATION_STATUS_POSTCOPY_RECOVER state
for the previous "migrate -r" command.  However I've no clue on why.

Maybe.. the stderr could help on what has failed; but by default they're
dropped.  So I'm wondering whether we should apply some patch like this to the
test temporarily:

---8<---
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index cc5e83d98a..d4e3c78acc 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -725,7 +725,7 @@ static void test_postcopy_recovery(void)
     QTestState *from, *to;
     g_autofree char *uri = NULL;
 
-    args->hide_stderr = true;
+    args->hide_stderr = false;
 
     if (migrate_postcopy_prepare(&from, &to, args)) {
         return;
---8<---

So if postcopy recovery is the only thing that fails, then we enable stderr for
this test. When it crashes we have a chance to get the stderr of both src/dst
qemu, maybe?

Thanks,

> 
> > #6  0x000002aa17b07bb6 in vm_shutdown () at ../../softmmu/cpus.c:280
> > #7  0x000002aa17b9716e in qemu_cleanup () at ../../softmmu/runstate.c:812
> > #8  0x000002aa1756d3f6 in main (argc=<optimized out>, argv=<optimized
> > out>, envp=<optimized out>) at ../../softmmu/main.c:51
> > 
> > No, I dunno why thread 4 was odd. Backtracing that thread seems to
> > only work about one time in two. Here's the backtrace from a
> > success:
> 
> That maybe due to postcopy userfault; it can confuse things sometimes.
> 
> > #0  0x000002aa17b1ca2e in load_memop (op=MO_8, haddr=0x3ff80c6d000) at
> > ../../accel/tcg/cputlb.c:1860
> 
> So just a read, on a page boundary?  So possibly triggered a page read
> over postcopy?
> 
> Dave
> 
> > #1  0x000002aa17b1ca2e in load_helper (full_load=<optimized out>,
> > code_read=false, op=MO_8, retaddr=4396070052416, oi=<optimized out>,
> > addr=1177997312, env=0x2aa1a86f890) at ../../accel/tcg/cputlb.c:1980
> > #2  0x000002aa17b1ca2e in full_ldub_mmu (env=0x2aa1a86f890,
> > addr=1177997312, oi=<optimized out>, retaddr=4396070052416)
> >     at ../../accel/tcg/cputlb.c:1996
> > #3  0x000003ff8a31a35e in code_gen_buffer ()
> > #4  0x000002aa17bbabb4 in cpu_tb_exec (tb_exit=<synthetic pointer>,
> > itb=<optimized out>, cpu=<optimized out>)
> >     at ../../accel/tcg/cpu-exec.c:353
> > #5  0x000002aa17bbabb4 in cpu_loop_exec_tb (tb_exit=<synthetic
> > pointer>, last_tb=<synthetic pointer>, tb=<optimized out>,
> > cpu=<optimized out>) at ../../accel/tcg/cpu-exec.c:812
> > #6  0x000002aa17bbabb4 in cpu_exec (cpu=<optimized out>) at
> > ../../accel/tcg/cpu-exec.c:970
> > #7  0x000002aa18764620 in  ()
> > 
> > -- PMM
> > 
> -- 
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 

-- 
Peter Xu




reply via email to

[Prev in Thread] Current Thread [Next in Thread]