qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] [Bug 1805256] Re: qemu-img hangs on high core count ARM sys


From: Rafael David Tinoco
Subject: [Qemu-devel] [Bug 1805256] Re: qemu-img hangs on high core count ARM system
Date: Tue, 10 Sep 2019 14:16:48 -0000

Alright, here is what is happening:

Whenever program is stuck, thread #2 backtrace is this:

(gdb) bt
#0  syscall () at ../sysdeps/unix/sysv/linux/aarch64/syscall.S:38
#1  0x0000aaaaaabd41b0 in qemu_futex_wait (val=<optimized out>, f=<optimized 
out>) at ./util/qemu-thread-posix.c:438
#2  qemu_event_wait (ev=ev@entry=0xaaaaaac87ce8 <rcu_call_ready_event>) at 
./util/qemu-thread-posix.c:442
#3  0x0000aaaaaabee03c in call_rcu_thread (opaque=opaque@entry=0x0) at 
./util/rcu.c:261
#4  0x0000aaaaaabd34c8 in qemu_thread_start (args=<optimized out>) at 
./util/qemu-thread-posix.c:498
#5  0x0000ffffbf26a880 in start_thread (arg=0xfffffffff5bf) at 
pthread_create.c:486
#6  0x0000ffffbf1c4b9c in thread_start () at 
../sysdeps/unix/sysv/linux/aarch64/clone.S:78

Meaning that code is waiting for a futex inside kernel.

(gdb) print rcu_call_ready_event
$4 = {value = 4294967295, initialized = true}

The QemuEvent "rcu_call_ready_event->value" is set to INT_MAX and I
don't know why yet.

rcu_call_ready_event->value is only touched by:

qemu_event_init() -> bool init ? EV_SET : EV_FREE
qemu_event_reset() -> atomic_or(&ev->value, EV_FREE)
qemu_event_set() -> atomic_xchg(&ev->value, EV_SET)
qemu_event_wait() -> atomic_cmpxchg(&ev->value, EV_FREE, EV_BUSY)'

And there should be no 0x7fff value for "ev->value".

qemu_event_init() is the one initializing the global:

    static QemuEvent rcu_call_ready_event;

and it is called by "rcu_init_complete()" which is called by
"rcu_init()":

    static void __attribute__((__constructor__)) rcu_init(void)

a constructor function.

So, "fixing" this issue by:

    (gdb) print rcu_call_ready_event
    $8 = {value = 4294967295, initialized = true}
    
    (gdb) watch rcu_call_ready_event
    Hardware watchpoint 1: rcu_call_ready_event
    
    (gdb) set rcu_call_ready_event.initialized = 1
    
    (gdb) set rcu_call_ready_event.value = 0

and note that I added a watchpoint to rcu_call_ready_event global:

<HANG>

Thread 1 "qemu-img" received signal SIGINT, Interrupt.
(gdb) thread 2
[Switching to thread 2 (Thread 0xffffbec61d90 (LWP 33625))]

(gdb) bt
#0  0x0000aaaaaabd4110 in qemu_event_reset (ev=ev@entry=0xaaaaaac87ce8 
<rcu_call_ready_event>)
#1  0x0000aaaaaabedff8 in call_rcu_thread (opaque=opaque@entry=0x0) at 
./util/rcu.c:255
#2  0x0000aaaaaabd34c8 in qemu_thread_start (args=<optimized out>) at 
./util/qemu-thread-posix.c:498
#3  0x0000ffffbf26a880 in start_thread (arg=0xfffffffff5bf) at 
pthread_create.c:486
#4  0x0000ffffbf1c4b9c in thread_start () at 
../sysdeps/unix/sysv/linux/aarch64/clone.S:78
(gdb) print rcu_call_ready_event
$9 = {value = 0, initialized = true}

You can see I advanced in the qemu_event_{reset,set,wait} logic.

    (gdb) disassemble /m 0x0000aaaaaabd4110
    Dump of assembler code for function qemu_event_reset:
    408     in ./util/qemu-thread-posix.c

    409     in ./util/qemu-thread-posix.c

    410     in ./util/qemu-thread-posix.c
    411     in ./util/qemu-thread-posix.c
       0x0000aaaaaabd40f0 <+0>:     ldrb    w1, [x0, #4]
       0x0000aaaaaabd40f4 <+4>:     cbz     w1, 0xaaaaaabd411c 
<qemu_event_reset+44>
       0x0000aaaaaabd411c <+44>:    stp     x29, x30, [sp, #-16]!
       0x0000aaaaaabd4120 <+48>:    adrp    x3, 0xaaaaaac20000
       0x0000aaaaaabd4124 <+52>:    add     x3, x3, #0x908
       0x0000aaaaaabd4128 <+56>:    mov     x29, sp
       0x0000aaaaaabd412c <+60>:    adrp    x1, 0xaaaaaac20000
       0x0000aaaaaabd4130 <+64>:    adrp    x0, 0xaaaaaac20000
       0x0000aaaaaabd4134 <+68>:    add     x3, x3, #0x290
       0x0000aaaaaabd4138 <+72>:    add     x1, x1, #0xc00
       0x0000aaaaaabd413c <+76>:    add     x0, x0, #0xd40
       0x0000aaaaaabd4140 <+80>:    mov     w2, #0x19b    // #411
       0x0000aaaaaabd4144 <+84>:    bl      0xaaaaaaaff190 <__assert_fail@plt>

    412     in ./util/qemu-thread-posix.c
       0x0000aaaaaabd40f8 <+8>:     ldr     w1, [x0]

    413     in ./util/qemu-thread-posix.c
       0x0000aaaaaabd40fc <+12>:    dmb     ishld

    414     in ./util/qemu-thread-posix.c
       0x0000aaaaaabd4100 <+16>:    cbz     w1, 0xaaaaaabd4108 
<qemu_event_reset+24>
       0x0000aaaaaabd4104 <+20>:    ret
       0x0000aaaaaabd4108 <+24>:    ldaxr   w1, [x0]
       0x0000aaaaaabd410c <+28>:    orr     w1, w1, #0x1
    => 0x0000aaaaaabd4110 <+32>:    stlxr   w2, w1, [x0]
       0x0000aaaaaabd4114 <+36>:    cbnz    w2, 0xaaaaaabd4108 
<qemu_event_reset+24>
       0x0000aaaaaabd4118 <+40>:    ret

And I'm currently inside the STLXR and LDAXR logic. To make sure my program 
counter is advancing, I added a breakpoint at 0x0000aaaaaabd4108, so CBNZ 
instruction would branch indefinitely into LDXAR instruction again, until the
LDAXR<->STLXR logic is satisfied (inside qemu_event_wait()).

(gdb) break *(0x0000aaaaaabd4108)
Breakpoint 2 at 0xaaaaaabd4108: file ./util/qemu-thread-posix.c, line 414.

which is basically this:

    if (value == EV_SET) {                        EV_SET == 0
        atomic_or(&ev->value, EV_FREE);           EV_FREE = 1
    }
    
and we can see that this logic being called one time after another:

(gdb) c
Thread 2 "qemu-img" hit Breakpoint 3, 0x0000aaaaaabd4108 in qemu_event_reset (
    ev=ev@entry=0xaaaaaac87ce8 <rcu_call_ready_event>) at 
./util/qemu-thread-posix.c:414

(gdb) c
Thread 2 "qemu-img" hit Breakpoint 3, 0x0000aaaaaabd4108 in qemu_event_reset (
    ev=ev@entry=0xaaaaaac87ce8 <rcu_call_ready_event>) at 
./util/qemu-thread-posix.c:414
    
(gdb) c
Thread 2 "qemu-img" hit Breakpoint 3, 0x0000aaaaaabd4108 in qemu_event_reset (
    ev=ev@entry=0xaaaaaac87ce8 <rcu_call_ready_event>) at 
./util/qemu-thread-posix.c:414

EVEN when rcu_call_ready_event->value is already EV_SET (0):

(gdb) print rcu_call_ready_event
$11 = {value = 0, initialized = true}

(gdb) info break
Num     Type           Disp Enb Address            What
1       hw watchpoint  keep y                      rcu_call_ready_event
3       breakpoint     keep n   0x0000aaaaaabd4108 qemu-thread-posix.c:414
        breakpoint already hit 23 times
4       breakpoint     keep y   0x0000aaaaaabd4148 qemu-thread-posix.c:424

IF I enable only rcu_call_ready_event HW watchpoint, nothing is
triggered.

(gdb) watch *(rcu_call_ready_event->value)
Hardware watchpoint 6: *(rcu_call_ready_event->value)

not if I set it directly to QemuEvent->value...

    assert(ev->initialized);
    value = atomic_read(&ev->value);
    smp_mb_acquire();
    if (value == EV_SET) {
        atomic_or(&ev->value, EV_FREE);
    }

meaning that "value" and "ev->value" might have a diff value... is that
so ?

(gdb) print value
$14 = <optimized out>

can't say.. checking registers AND stack:

       0x0000aaaaaabd4100 <+16>:    cbz     w1, 0xaaaaaabd4108 
<qemu_event_reset+24>
       0x0000aaaaaabd4104 <+20>:    ret
       0x0000aaaaaabd4108 <+24>:    ldaxr   w1, [x0]
       0x0000aaaaaabd410c <+28>:    orr     w1, w1, #0x1
    => 0x0000aaaaaabd4110 <+32>:    stlxr   w2, w1, [x0]
       0x0000aaaaaabd4114 <+36>:    cbnz    w2, 0xaaaaaabd4108 
       

x0             0xaaaaaac87ce8      187649986428136
x1             0x1                 1
x2             0x1                 1
x3             0x0                 0
x4             0xffffbec61e98      281473882398360
x5             0xffffbec61c90      281473882397840
x6             0xffffbec61c90      281473882397840
x7             0x1                 1
x8             0x65                101
x9             0x0                 0
x10            0x0                 0
x11            0x0                 0
x12            0xffffbec61d90      281473882398096
x13            0x0                 0
x14            0x0                 0
x15            0x2                 2
x16            0xffffbf67ccf0      281473892994288
x17            0xffffbf274938      281473888766264
x18            0x23f               575
x19            0x0                 0
x20            0xaaaaaac87ce8      187649986428136
x21            0x0                 0
x22            0xfffffffff5bf      281474976708031
x23            0xaaaaaac87ce0      187649986428128
x24            0xaaaaaac29000      187649986039808
x25            0xfffffffff658      281474976708184
x26            0x1000              4096
x27            0xffffbf28c000      281473888862208
x28            0xffffbec61d90      281473882398096
x29            0xffffbec61420      281473882395680
x30            0xaaaaaabedff8      187649985798136
sp             0xffffbec61420      0xffffbec61420
pc             0xaaaaaabd4110      0xaaaaaabd4110 <qemu_event_reset+32>
cpsr           0x0                 [ EL=0 ]
fpsr           0x0                 0
fpcr           0x0                 0

AND the ORR instruction is ALWAYS being executed against 0x1 (not 0x0,
which is what I just changed by changing .value):

(gdb) print value
$14 = <optimized out>

       0x0000aaaaaabd410c <+28>:    orr     w1, w1, #0x1
       
#0x1 is being used instead of contents of "value" local variable (volatile).

I'll recompile QEMU flagging all those local "unsigned value" variables
as being volatile and check if optimization changes. Or even try to
disable optimizations.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on high core count ARM system

Status in QEMU:
  Confirmed
Status in qemu package in Ubuntu:
  In Progress

Bug description:
  On the HiSilicon D06 system - a 96 core NUMA arm64 box - qemu-img
  frequently hangs (~50% of the time) with this command:

  qemu-img convert -f qcow2 -O qcow2 /tmp/cloudimg /tmp/cloudimg2

  Where "cloudimg" is a standard qcow2 Ubuntu cloud image. This
  qcow2->qcow2 conversion happens to be something uvtool does every time
  it fetches images.

  Once hung, attaching gdb gives the following backtrace:

  (gdb) bt
  #0  0x0000ffffae4f8154 in __GI_ppoll (fds=0xaaaae8a67dc0, 
nfds=187650274213760, 
      timeout=<optimized out>, timeout@entry=0x0, sigmask=0xffffc123b950)
      at ../sysdeps/unix/sysv/linux/ppoll.c:39
  #1  0x0000aaaabbefaf00 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized 
out>, 
      __fds=<optimized out>) at /usr/include/aarch64-linux-gnu/bits/poll2.h:77
  #2  qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, 
      timeout=timeout@entry=-1) at util/qemu-timer.c:322
  #3  0x0000aaaabbefbf80 in os_host_main_loop_wait (timeout=-1)
      at util/main-loop.c:233
  #4  main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:497
  #5  0x0000aaaabbe2aa30 in convert_do_copy (s=0xffffc123bb58) at 
qemu-img.c:1980
  #6  img_convert (argc=<optimized out>, argv=<optimized out>) at 
qemu-img.c:2456
  #7  0x0000aaaabbe2333c in main (argc=7, argv=<optimized out>) at 
qemu-img.c:4975

  Reproduced w/ latest QEMU git (@ 53744e0a182)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1805256/+subscriptions



reply via email to

[Prev in Thread] Current Thread [Next in Thread]