[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
s390 migration crash
From: |
Dr. David Alan Gilbert |
Subject: |
s390 migration crash |
Date: |
Tue, 21 Mar 2023 20:24:37 +0000 |
User-agent: |
Mutt/2.2.9 (2022-11-12) |
Hi Peter's,
Peter M pointed me to a seg in a migration test in CI; I can reproduce
it:
* On an s390 host
* only as part of a make check - running migration-test by itself
doesn't trigger for me.
* It looks like it's postcopy preempt
(gdb) bt full
#0 iov_size (iov=iov@entry=0x2aa00e60670, iov_cnt=<optimized out>) at
../util/iov.c:88
len = 13517923312037845750
i = 17305
#1 0x000002aa004d068c in qemu_fflush (f=0x2aa00e58630) at
../migration/qemu-file.c:307
local_error = 0x0
#2 0x000002aa004d0e04 in qemu_fflush (f=<optimized out>) at
../migration/qemu-file.c:297
#3 0x000002aa00613962 in postcopy_preempt_shutdown_file
(s=s@entry=0x2aa00d1b4e0) at ../migration/ram.c:4657
#4 0x000002aa004e12b4 in migration_completion (s=0x2aa00d1b4e0) at
../migration/migration.c:3469
ret = <optimized out>
current_active_state = 5
must_precopy = 0
can_postcopy = 0
in_postcopy = true
pending_size = 0
__func__ = "migration_iteration_run"
iter_state = <optimized out>
s = 0x2aa00d1b4e0
thread = <optimized out>
setup_start = <optimized out>
thr_error = <optimized out>
urgent = <optimized out>
#5 migration_iteration_run (s=0x2aa00d1b4e0) at ../migration/migration.c:3882
must_precopy = 0
can_postcopy = 0
in_postcopy = true
pending_size = 0
__func__ = "migration_iteration_run"
iter_state = <optimized out>
s = 0x2aa00d1b4e0
thread = <optimized out>
setup_start = <optimized out>
thr_error = <optimized out>
urgent = <optimized out>
#6 migration_thread (opaque=opaque@entry=0x2aa00d1b4e0) at
../migration/migration.c:4124
iter_state = <optimized out>
s = 0x2aa00d1b4e0
--Type <RET> for more, q to quit, c to continue without paging--
thread = <optimized out>
setup_start = <optimized out>
thr_error = <optimized out>
urgent = <optimized out>
#7 0x000002aa00819b8c in qemu_thread_start (args=<optimized out>) at
../util/qemu-thread-posix.c:541
__cancel_buf =
{__cancel_jmp_buf = {{__cancel_jmp_buf = {{__gregs =
{4396782422080, 4393751543808, 4397299389454, 4396844235904, 2929182727824,
2929182933488, 4396843986792, 4397299389455, 33679382915066768,
33678512846981306}, __fpregs = {4396774031360, 8392704, 2929182933488, 0,
4396782422272, 2929172491858, 4396774031360, 1}}}, __mask_was_saved = 0}},
__pad = {0x3ffb4a77a60, 0x0, 0x0, 0x0}}
__cancel_routine = 0x2aa00819bf0 <qemu_thread_atexit_notify>
__not_first_call = <optimized out>
start_routine = 0x2aa004e08f0 <migration_thread>
arg = 0x2aa00d1b4e0
r = <optimized out>
#8 0x000003ffb7b1e2e6 in start_thread () at /lib64/libc.so.6
#9 0x000003ffb7aafdbe in thread_start () at /lib64/libc.so.6
It looks like it's in the preempt test:
(gdb) where
#0 0x000003ffb17a0126 in __pthread_kill_implementation () from /lib64/libc.so.6
#1 0x000003ffb1750890 in raise () from /lib64/libc.so.6
#2 0x000003ffb172a340 in abort () from /lib64/libc.so.6
#3 0x000002aa0041c130 in qtest_check_status (s=<optimized out>) at
../tests/qtest/libqtest.c:194
#4 0x000003ffb1a3b5de in g_hook_list_invoke () from /lib64/libglib-2.0.so.0
#5 <signal handler called>
#6 0x000003ffb17a0126 in __pthread_kill_implementation () from /lib64/libc.so.6
#7 0x000003ffb1750890 in raise () from /lib64/libc.so.6
#8 0x000003ffb172a340 in abort () from /lib64/libc.so.6
#9 0x000002aa00420318 in qmp_fd_receive (fd=<optimized out>) at
../tests/qtest/libqmp.c:80
#10 0x000002aa0041d5ee in qtest_qmp_receive_dict (s=0x2aa01eb2700) at
../tests/qtest/libqtest.c:713
#11 qtest_qmp_receive (s=0x2aa01eb2700) at ../tests/qtest/libqtest.c:701
#12 qtest_vqmp (s=s@entry=0x2aa01eb2700, fmt=fmt@entry=0x2aa00487100 "{
'execute': 'query-migrate' }", ap=ap@entry=0x3ffc247cc68)
at ../tests/qtest/libqtest.c:765
#13 0x000002aa00413f1e in wait_command (who=who@entry=0x2aa01eb2700,
command=command@entry=0x2aa00487100 "{ 'execute': 'query-migrate' }")
at ../tests/qtest/migration-helpers.c:73
#14 0x000002aa00414078 in migrate_query (who=who@entry=0x2aa01eb2700) at
../tests/qtest/migration-helpers.c:139
#15 migrate_query_status (who=who@entry=0x2aa01eb2700) at
../tests/qtest/migration-helpers.c:161
#16 0x000002aa00414480 in check_migration_status (ungoals=0x0,
goal=0x2aa00495c7e "completed", who=0x2aa01eb2700) at
../tests/qtest/migration-helpers.c:177
#17 wait_for_migration_status (who=0x2aa01eb2700, goal=<optimized out>,
ungoals=0x0) at ../tests/qtest/migration-helpers.c:202
#18 0x000002aa0041300e in migrate_postcopy_complete
(from=from@entry=0x2aa01eb2700, to=to@entry=0x2aa01eb3000,
args=args@entry=0x3ffc247cf48)
at ../tests/qtest/migration-test.c:1137
#19 0x000002aa004131a4 in test_postcopy_common (args=0x3ffc247cf48) at
../tests/qtest/migration-test.c:1162
#20 test_postcopy_preempt () at ../tests/qtest/migration-test.c:1178
Looking at the iov and file it's garbage; so it makes me think this is
something like a flush on a closed file.
Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
- s390 migration crash,
Dr. David Alan Gilbert <=
Re: s390 migration crash, Daniel P . Berrangé, 2023/03/22