qemu-discuss

Issue about QEMU multi-core in TCG mode


From: Kaifeng Xu
Subject: Issue about QEMU multi-core in TCG mode
Date: Fri, 18 Jun 2021 04:27:39 -0400

Hi,
I am using QEMU 5.05 and keep running into problems when running it with a multi-core configuration. I use a qcow2 Ubuntu 18.04 image and launch the VM in TCG mode, which I need because I have to do tracing at some point. Without the multi-core option (that is, without "-smp 8"), everything works fine and the VM behaves normally. However, as soon as I add "-smp 8", the guest kernel keeps logging bugs.

Here is my command to launch the VM:


qemu-system-x86_64 \
    -cpu qemu64,+pcid \
    -smp 8 \
    -m 8G \
    -drive if=virtio,file=${DISK},cache=none \
    -device pqii \
    -trace events=`pwd`/events \
    -net user,hostfwd=tcp::10022-:22 \
    -net nic \
    -display none \
    -nographic \
    -D ${LOG_FILE}
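
One way to check whether the problem depends on how TCG schedules the vCPUs is to make the threading mode explicit. The following is a minimal sketch, not the exact command used above: it only assembles and prints a command line so that it can be inspected before running. `TCG_THREAD`, `build_qemu_cmd`, and the default image path are illustrative names, and the custom device, trace, network, and log options are left out for brevity. `-accel tcg,thread=single|multi` is the standard QEMU accelerator option.

```shell
#!/bin/sh
# Sketch only: print a QEMU command line with an explicit TCG threading mode.
# TCG_THREAD, DISK's default value and build_qemu_cmd are illustrative
# assumptions, not part of the original setup.
TCG_THREAD="${TCG_THREAD:-single}"   # "single": one thread runs all vCPUs in turn
                                     # "multi":  one host thread per vCPU (MTTCG)
DISK="${DISK:-ubuntu-18.04.qcow2}"   # hypothetical image path

build_qemu_cmd() {
    # Emit one argument per line so the result is easy to inspect or diff.
    printf '%s\n' \
        qemu-system-x86_64 \
        -accel "tcg,thread=${TCG_THREAD}" \
        -cpu qemu64,+pcid \
        -smp 8 \
        -m 8G \
        -drive "if=virtio,file=${DISK},cache=none" \
        -display none \
        -nographic
}

build_qemu_cmd
```

Running the script once with `TCG_THREAD=single` and once with `TCG_THREAD=multi`, then booting each printed command, would show whether the guest errors appear only in one of the two modes.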

And here are the bug logs I got:

[10782.933762] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[10782.933972] IP:           (null)
[10782.934061] PGD 0 P4D 0
[10782.934136] Oops: 0010 [#1] SMP NOPTI
[10782.934236] Modules linked in: veth xt_nat xt_tcpudp xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_
[10782.935578] CPU: 7 PID: 2618 Comm: systemd Tainted: G        W        4.15.0-144-generic #148-Ubuntu
[10782.935821] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-44-g88ab0c15525c-prebuilt.qemu.org 04/01/2014
[10782.936001] RIP: 0010:          (null)
[10782.936001] RSP: 0018:ffffb147c31d7e68 EFLAGS: 2f80a606
[10782.936001] RAX: 00007f1e749c6d5e RBX: ffff8dea33cc8000 RCX: 0000000000000000
[10782.936001] RDX: ffffb147c31d7e68 RSI: ffffb147c31d7f58 RDI: ffffb147c31d7df8
[10782.936001] RBP: 0000000000000000 R08: 00000000000261a0 R09: ffffffffa9492f7c
[10782.936001] R10: ffffb147c31d7eb8 R11: d350b3de8b0fabc0 R12: ffffb147c31d7f58
[10782.936001] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[10782.936001] FS:  00007f1e74eb8dc0(0000) GS:ffff8dea3fdc0000(0000) knlGS:0000000000000000
[10782.936001] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[10782.936001] CR2: 0000000000000000 CR3: 800000022ef00005 CR4: 00000000000206e0
[10782.936001] Call Trace:
[10782.936001] WARNING: kernel stack frame pointer at 00000000f55351d1 in systemd:2618 has bad value           (null)
[10782.936001] unwind stack type:0 next_sp:          (null) mask:0x2 graph_idx:0
[10782.936001] 00000000f55351d1: 0000000000000000 ...
[10782.936001] 00000000b9b96aa6: 0000000000000005 (0x5)
[10782.936001] 000000000afe1d6d: 0000000000000001 (0x1)
[10782.936001] 000000003c53ff11: 00007f1e749c6d5e (0x7f1e749c6d5e)
[10782.936001] 00000000090dcad6: 00000000fffffffe (0xfffffffe)
[10782.936001] 0000000038177723: ffff8de9c0f9c000 (0xffff8de9c0f9c000)
[10782.936001] 0000000062f892b0: ffffb147c31d7ea8 (0xffffb147c31d7ea8)
[10782.936001] 00000000703f0b6e: ffffffffa9492f7c (putname+0x4c/0x60)
[10782.936001] 000000003a8885a5: 00000000000a8100 (0xa8100)
[10782.936001] 000000004a37862b: ffffb147c31d7f18 (0xffffb147c31d7f18)
[10782.936001] 00000000153b5649: ffffffffa947f07d (do_sys_open+0x13d/0x2c0)
[10782.936001] 000000006c356c4d: 0000000000000000 ...
[10782.936001] 00000000ec5d62fa: 00028100c31d7f28 (0x28100c31d7f28)
[10782.936001] 00000000cc1cadac: 00000004a9820000 (0x4a9820000)
[10782.936001] 00000000a86b0798: 0000000000000000 ...
[10782.936001] 00000000a3eba124: ebe7118d6379f500 (0xebe7118d6379f500)
[10782.936001] 00000000443db980: 0000000080000010 (0x80000010)
[10782.936001] 000000009481abbc: ffffb147c31d7f28 (0xffffb147c31d7f28)
[10782.936001] 00000000aa081a54: ffffffffa92036b0 (syscall_slow_exit_work+0x50/0xd0)
[10782.936001] 00000000d4d02dc4: ffffb147c31d7f58 (0xffffb147c31d7f58)
[10782.936001] 00000000dac4a517: 0000000000000000 ...
[10782.936001] 000000001c70e67d: ffffb147c31d7f48 (0xffffb147c31d7f48)
[10782.936001] 00000000be5e9eb5: ffffffffa9203afb (do_syscall_64+0x12b/0x130)
[10782.936001] 000000004e226f17: 0000000000000000 ...
[10782.936001] 00000000fbf867f1: ffffffffa9c00085 (entry_SYSCALL_64_after_hwframe+0x41/0xa6)
[10782.936001] 00000000fedb0a26: 00005605be317430 (0x5605be317430)
[10782.936001] 000000009e41e2d0: 00007ffe32a84140 (0x7ffe32a84140)
[10782.936001] 00000000c80efe8c: 00005605be33b4f0 (0x5605be33b4f0)
[10782.936001] 000000005fccb0d0: 00005605be32bbc6 (0x5605be32bbc6)
[10782.936001] 00000000f9f33a54: 0000000000000008 (0x8)
[10782.936001] 000000002e8364ca: 00007ffe32a841b0 (0x7ffe32a841b0)
[10782.936001] 0000000074021ae6: 0000000000000246 (0x246)
[10782.936001] 000000001b70c12e: 0000000000000000 ...
[10782.936001] 0000000031ebce76: 0000000000000020 (0x20)
[10782.936001] 00000000306e523f: 00005605be317439 (0x5605be317439)
[10782.936001] 00000000b78ead6b: fffffffffffffffe (0xfffffffffffffffe)
[10782.936001] 00000000bbf8ea8b: 00007f1e749c6d5e (0x7f1e749c6d5e)
[10782.936001] 000000007395f0fc: 00000000000a0100 (0xa0100)
[10782.936001] 000000006e552588: 00005605be32bba0 (0x5605be32bba0)
[10782.936001] 00000000368934ac: 00000000ffffff9c (0xffffff9c)
[10782.936001] 000000003191ba85: 0000000000000101 (0x101)
[10782.936001] 00000000abd843ca: 00007f1e749c6d5e (0x7f1e749c6d5e)
[10782.936001] 00000000dc1f5f3b: 0000000000000033 (0x33)
[10782.936001] 000000008a0d9720: 0000000000000246 (0x246)
[10782.936001] 0000000059917d6a: 00007ffe32a840c0 (0x7ffe32a840c0)
[10782.936001] 00000000d394c43d: 000000000000002b (0x2b)
[10782.936001]  ? putname+0x4c/0x60
[10783.031397]  ? do_sys_open+0x13d/0x2c0
[10783.031397]  ? syscall_slow_exit_work+0x50/0xd0
[10783.031397]  ? do_syscall_64+0x12b/0x130
[10783.031397]  ? entry_SYSCALL_64_after_hwframe+0x41/0xa6
[10783.031397] Code:  Bad RIP value.
[10783.031397] RIP:           (null) RSP: ffffb147c31d7e68
[10783.031397] CR2: 0000000000000000
[10783.168692] ---[ end trace 2c27d7b7f0567492 ]---
[10783.187136] device vetha4687e8 left promiscuous mode
[10783.195831] docker0: port 9(vetha4687e8) entered disabled state

I also tried "-icount shift=0,align=off" to slow down the clock, but then I sometimes got a kernel panic:

[ 1125.364015] PANIC: double fault, error_code: 0x0
[ 1125.364366] Kernel panic - not syncing: Machine halted.
[ 1125.364510] CPU: 1 PID: 20906 Comm: Process Metrics Tainted: G           OE    4.15.0-118-generic #119-Ubuntu
[ 1125.364773] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-44-g88ab0c15525c-prebuilt.qemu.org 04/01/2014
[ 1125.365149] Call Trace:
[ 1125.365219]  <#DF>
[ 1125.365280]  dump_stack+0x6d/0x8e
[ 1125.365375]  panic+0xe4/0x254
[ 1125.365462]  df_debug+0x2d/0x30
[ 1125.365554]  do_double_fault+0xa1/0x130
[ 1125.365662]  double_fault+0x1e/0x30
[ 1125.365762] RIP: 0010:kmem_cache_alloc+0x9c/0x1c0
[ 1125.365890] RSP: 0018:ffffa0b58185fc90 EFLAGS: 774b5ae2
[ 1125.366033] RAX: ffff8d82774b5a01 RBX: ffff8d82774b4810 RCX: 0000000000032565
[ 1125.366232] RDX: 0000000000032564 RSI: 00000000014000c0 RDI: 0000333280028980
[ 1125.366424] RBP: ffffa0b58185fcc0 R08: ffffc0b57fc68980 R09: 0000000000000031
[ 1125.366615] R10: ffffa0b58185fe23 R11: ffffa0b58185fe26 R12: ffff8d82774b5ae0
[ 1125.366807] R13: 00000000014000c0 R14: ffff8d82f7165380 R15: ffff8d82ebcc0300
[ 1125.367001]  </#DF>
[ 1125.367065]  ? proc_alloc_inode+0x1a/0x60
[ 1125.367178]  ? tid_fd_revalidate+0x120/0x120
[ 1125.367298]  proc_alloc_inode+0x1a/0x60
[ 1125.367406]  alloc_inode+0x20/0x90
[ 1125.367503]  new_inode_pseudo+0x11/0x60
[ 1125.367534]  new_inode+0x19/0x30
[ 1125.367534]  proc_pid_make_inode+0x1b/0xb0
[ 1125.367534]  proc_fd_instantiate+0x25/0x90
[ 1125.367534]  proc_fill_cache+0x118/0x180
[ 1125.367534]  proc_readfd_common+0x172/0x1f0
[ 1125.367534]  ? tid_fd_revalidate+0x120/0x120
[ 1125.367534]  proc_readfd+0x15/0x20
[ 1125.367534]  iterate_dir+0x9e/0x1a0
[ 1125.367534]  SyS_getdents64+0xa0/0x130
[ 1125.367534]  ? verify_dirent_name+0x30/0x30
[ 1125.367534]  do_syscall_64+0x73/0x130
[ 1125.367534]  ? do_syscall_64+0x73/0x130
[ 1125.367534]  entry_SYSCALL_64_after_hwframe+0x41/0xa6
[ 1125.367534] RIP: 0033:0x7fc58a49c2e7
[ 1125.367534] RSP: 002b:00007fc5614947f8 EFLAGS: 00000293 ORIG_RAX: 00000000000000d9
[ 1125.367534] RAX: ffffffffffffffda RBX: 00007fc48c056500 RCX: 00007fc58a49c2e7
[ 1125.367534] RDX: 0000000000008000 RSI: 00007fc48c056530 RDI: 0000000000000004
[ 1125.367534] RBP: 00007fc48c056530 R08: 0000000000000030 R09: 0000000000000003
[ 1125.367534] R10: 0000000000000000 R11: 0000000000000293 R12: ffffffffffffff80
[ 1125.367534] R13: 00007fc48c056504 R14: 0000000000000002 R15: 00007fc48c006e68
[ 1125.367534] Kernel Offset: 0x36200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 1125.367534] ---[ end Kernel panic - not syncing: Machine halted.
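
For reference, the effect of the shift value passed to -icount can be worked out with a little arithmetic. The sketch below assumes the behaviour described in the QEMU documentation, where shift=N means the virtual CPU executes one instruction every 2^N ns of virtual time; `icount_ns_per_insn` is an illustrative helper name, not a QEMU tool.

```shell
#!/bin/sh
# Sketch only: relate -icount shift=N to virtual time per guest instruction.
# icount_ns_per_insn is an illustrative helper, not part of QEMU.
icount_ns_per_insn() {
    # shift=N is documented as one instruction every 2^N ns of virtual time.
    echo $((1 << $1))
}

for shift in 0 3 10; do
    echo "shift=${shift}: $(icount_ns_per_insn "$shift") ns per instruction"
done
```

Under that assumption, shift=0 corresponds to 1 ns of virtual time per instruction, and larger shift values make each instruction account for more virtual time.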

I don't know whether this is a bug in QEMU's TCG mode. I would be really grateful if anyone could help.

Thanks for any help!

Best,
Kaifeng Xu
