bug-guile

Re: Hung threads


From: Linas Vepstas
Subject: Re: Hung threads
Date: Sun, 16 Nov 2008 10:45:05 -0600

Hi,

2008/11/14 Linas Vepstas <address@hidden>:
> Here's a deadlock I saw today.

Here's a different deadlock, this one fully debugged. The
guilty code is in make_struct(), in struct.c circa line 463,
which tries to allocate memory while holding the
CRITICAL_SECTION lock. The allocation can kick off GC, and
of course everything then deadlocks in GC.

I am trying to figure out how to fix this now. It's kind
of gnarly.
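For illustration only, here is the general shape of the problem and of one conventional way around it, written in plain pthreads C rather than Guile's real code. The mutex, the list, and both function names are hypothetical. malloc() stands in for a GC-aware allocator such as scm_gc_malloc(), which (unlike malloc) may block on other locks. This is a sketch of the pattern, not a proposed patch to make_struct().

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins: a global "critical section" mutex and a
   shared list. Neither is Guile's real data structure. */
static pthread_mutex_t critical_section = PTHREAD_MUTEX_INITIALIZER;

struct node { int value; struct node *next; };
static struct node *shared_list = NULL;

/* Risky shape: the allocation happens while the critical section is
   held. If the allocator can block on another lock (as a GC-aware
   allocator can), a lock cycle becomes possible. */
int push_alloc_inside(int value)
{
    pthread_mutex_lock(&critical_section);
    struct node *n = malloc(sizeof *n);  /* allocator called under the lock */
    if (n == NULL) {
        pthread_mutex_unlock(&critical_section);
        return -1;
    }
    n->value = value;
    n->next = shared_list;
    shared_list = n;
    pthread_mutex_unlock(&critical_section);
    return 0;
}

/* Safer shape: allocate and initialize first, then take the lock
   only for the short step that publishes the new object. */
int push_alloc_outside(int value)
{
    struct node *n = malloc(sizeof *n);  /* no lock held here */
    if (n == NULL)
        return -1;
    n->value = value;
    pthread_mutex_lock(&critical_section);
    n->next = shared_list;
    shared_list = n;
    pthread_mutex_unlock(&critical_section);
    return 0;
}
```

Both functions leave the list in the same state; the only difference is whether the allocator runs with the lock held. Whether make_struct() can be rearranged this way depends on what the critical section there is actually protecting.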

Summary:
thread 7 -- holding critical section lock, sleeping on &scm_i_sweep_mutex
thread 5 -- holding heap_mutex, sleeping on critical section
thread 12 -- holding &scm_i_sweep_mutex, sleeping on heap_mutex


^C
Program received signal SIGINT, Interrupt.
[Switching to Thread 0xf79f86c0 (LWP 10364)]
0xffffe425 in __kernel_vsyscall ()
(gdb) info threads
  12 Thread 0xf24fdb90 (LWP 10395)  0xffffe425 in __kernel_vsyscall ()
  10 Thread 0xf34ffb90 (LWP 10389)  0xffffe425 in __kernel_vsyscall ()
  9 Thread 0xf3e7db90 (LWP 10387)  0xffffe425 in __kernel_vsyscall ()
  7 Thread 0xf4e7fb90 (LWP 10380)  0xffffe425 in __kernel_vsyscall ()
  6 Thread 0xf5680b90 (LWP 10377)  0xffffe425 in __kernel_vsyscall ()
  5 Thread 0xf5ec4b90 (LWP 10374)  0xffffe425 in __kernel_vsyscall ()
  2 Thread 0xf76c7b90 (LWP 10365)  0xffffe425 in __kernel_vsyscall ()
* 1 Thread 0xf79f86c0 (LWP 10364)  0xffffe425 in __kernel_vsyscall ()

lock summary
thread 5 -- holding heap_mutex
            sleeping on CRITICAL_SECTION in scm_c_catch

thread 6 -- holds no locks
            sleeping on &scm_i_sweep_mutex in increase_mtrigger

thread 7 -- holding critical section lock  in make_struct
            then tries to alloc mem ... !!!!!
            sleeping on &scm_i_sweep_mutex in increase_mtrigger

thread 9 -- holding heap_mutex
            sleeping on CRITICAL_SECTION in scm_c_catch

thread 10 -- holding ?? in  increase_mtrigger
            called from scm_gc_register_collectable_memory
            sleeping on &scm_i_sweep_mutex in increase_mtrigger

thread 12 -- trying to put everything to sleep for GC,
             holding &scm_i_sweep_mutex, the admin mutex, and most heap_mutexes;
             sleeping on the remaining heap_mutexes that are still held.

The guilty party is thread 7 which tries to alloc memory while holding
a critical section lock. This leads to a deadlock:

thread 7 -- holding critical section lock, sleeping on &scm_i_sweep_mutex
thread 5 -- holding heap_mutex, sleeping on critical section
thread 12 -- holding &scm_i_sweep_mutex, sleeping on heap_mutex
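The three lines above form a cycle in the wait-for graph. A minimal sketch of checking that mechanically follows. The table and function names are made up for this example, and each thread is simplified to holding exactly one lock (thread 12 really holds several).

```c
#include <assert.h>
#include <stdio.h>

/* Lock identifiers for the three locks named in the summary. */
enum { SWEEP_MUTEX, HEAP_MUTEX, CRITICAL_SECTION, N_LOCKS };

/* One row per thread: the lock it holds and the lock it sleeps on.
   Simplification: one held lock per thread. */
struct waiter { int thread_id; int holds; int waits_for; };

static const struct waiter deadlocked[] = {
    { 7,  CRITICAL_SECTION, SWEEP_MUTEX      },
    { 5,  HEAP_MUTEX,       CRITICAL_SECTION },
    { 12, SWEEP_MUTEX,      HEAP_MUTEX       },
};

/* Index of the table row whose thread holds `lock`, or -1 if none. */
static int holder_of(const struct waiter *w, int n, int lock)
{
    for (int i = 0; i < n; i++)
        if (w[i].holds == lock)
            return i;
    return -1;
}

/* Starting at row `start`, repeatedly hop from a thread to the holder
   of the lock it is waiting for. Returns the number of hops needed to
   get back to `start`, or 0 if the walk never returns there. */
int cycle_length(const struct waiter *w, int n, int start)
{
    int cur = start;
    for (int steps = 1; steps <= n; steps++) {
        cur = holder_of(w, n, w[cur].waits_for);
        if (cur < 0)
            return 0;       /* the awaited lock is free: no deadlock here */
        if (cur == start)
            return steps;   /* back where we started: a cycle */
    }
    return 0;
}
```

Calling cycle_length(deadlocked, 3, 0) walks 7 -> 12 -> 5 -> 7 and returns 3, so no thread in the cycle can ever make progress.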


(gdb) call prt_lockholders()

Thread 0xf5680b90 -- thread 6

Thread 0xf34ffb90  -- thread 10
0: mutex (0xf7816d0c) in:
        /usr/lib/libguile.so.17 [0xf778f9f1]
        c_register_collectable_memory+0x2a) [0xf778fbca]
        /libguile.so.17(scm_gc_malloc+0x40) [0xf7790010]
        .so.17(scm_gc_calloc+0x2c) [0xf77901bc]

Thread 0xf24fdb90 -- thread 12
0: mutex (0xf7812780) in: -- this is the scm_i_sweep_mutex
        /usr/lib/libguile.so.17(scm_i_thread_put_to_sleep+0x7f) [0xf77e395f]
        f]
        x19) [0xf778db29]
        (scm_gc_register_collectable_memory+0x2a) [0xf778fbca]
1: &thread_admin_mutex (0xf78191ec) in:
        /usr/lib/libguile.so.17(scm_i_thread_put_to_sleep+0x7f) [0xf77e395f]
        /usr/lib/libguile.so.17(scm_i_gc+0x19) [0xf778db29]
        /usr/lib/libguile.so.17 [0xf778fa5c]
        /usr/lib/libguile.so.17(scm_gc_register_collectable_memory+0x2a) [0xf778fbca]
2: &t->heap_mutex (0x8c11484) in:
        /usr/lib/libguile.so.17(scm_i_thread_put_to_sleep+0x7f) [0xf77e395f]
        /usr/lib/libguile.so.17(scm_i_gc+0x19) [0xf778db29]
        /usr/lib/libguile.so.17 [0xf778fa5c]
        /usr/lib/libguile.so.17(scm_gc_register_collectable_memory+0x2a) [0xf778fbca]
3: &t->heap_mutex (0x8c682bc) in:
        /usr/lib/libguile.so.17(scm_i_thread_put_to_sleep+0x7f) [0xf77e395f]
        /usr/lib/libguile.so.17(scm_i_gc+0x19) [0xf778db29]
        /usr/lib/libguile.so.17 [0xf778fa5c]
        /usr/lib/libguile.so.17(scm_gc_register_collectable_memory+0x2a) [0xf778fbca]
4: &t->heap_mutex (0xf3518c74) in:
        /usr/lib/libguile.so.17(scm_i_thread_put_to_sleep+0x7f) [0xf77e395f]
        /usr/lib/libguile.so.17(scm_i_gc+0x19) [0xf778db29]
        /usr/lib/libguile.so.17 [0xf778fa5c]
        /usr/lib/libguile.so.17(scm_gc_register_collectable_memory+0x2a) [0xf778fbca]
5: &t->heap_mutex (0x8c10e84) in:
        /usr/lib/libguile.so.17(scm_i_thread_put_to_sleep+0x7f) [0xf77e395f]
        /usr/lib/libguile.so.17(scm_i_gc+0x19) [0xf778db29]
        /usr/lib/libguile.so.17 [0xf778fa5c]
        /usr/lib/libguile.so.17(scm_gc_register_collectable_memory+0x2a) [0xf778fbca]

Thread 0xf4e7fb90 -- thread 7
0: &scm_i_critical_section_mutex (0xf781c808) in:
        /usr/lib/libguile.so.17(scm_make_struct+0xe2) [0xf77e1d52]
        /usr/lib/libguile.so.17(scm_make_stack+0x181) [0xf77c6b11]
        
        /home/linas/src/novamente/src/opencog-stage4/staging/bin/opencog/guile/libsmob.so(_ZN7opencog10SchemeEval17preunwind_handlerEP17scm_unused_structS2_+0x26) [0xf7879696]
        /home/linas/src/novamente/src/opencog-stage4/staging/bin/opencog/guile/libsmob.so(_ZN7opencog10SchemeEval25preunwind_handler_wrapperEPvP17scm_unused_structS3_+0x31) [0xf78796db]

Thread 0xf3e7db90 -- thread 9
0: &t->heap_mutex (0x8c676bc) in:
        /usr/lib/libguile.so.17 [0xf778f9f1]
        /usr/lib/libguile.so.17(scm_gc_register_collectable_memory+0x2a)
[0xf778fbca]
        0xf778fbca]
        f7790010]

Thread 0xf5ec4b90 -- thread 5
0: &t->heap_mutex (0x8c66dec) in:
        /usr/lib/libguile.so.17 [0xf778f9f1]
        /usr/lib/libguile.so.17(scm_gc_register_collectable_memory+0x2a)
[0xf778fbca]
        0xf778fbca]
        f7790010]
(gdb)
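For readers wondering how a helper like prt_lockholders() can produce a dump of this kind: below is a minimal sketch of a mutex wrapper that records who locked it and from where, using glibc's backtrace() and backtrace_symbols_fd(). It is not the debug-locks.c used above. The struct and all tracked_* names are hypothetical, and it assumes glibc's <execinfo.h> is available.

```c
#include <assert.h>
#include <execinfo.h>
#include <pthread.h>
#include <stdio.h>

#define MAX_FRAMES 8

/* Hypothetical wrapper: a mutex plus a record of its current holder. */
struct tracked_mutex {
    pthread_mutex_t mutex;
    const char *name;           /* label printed in the dump */
    int held;                   /* 1 while some thread owns the mutex */
    pthread_t owner;            /* valid only while held == 1 */
    void *frames[MAX_FRAMES];   /* return addresses captured at lock time */
    int n_frames;
};

void tracked_init(struct tracked_mutex *m, const char *name)
{
    pthread_mutex_init(&m->mutex, NULL);
    m->name = name;
    m->held = 0;
    m->n_frames = 0;
}

void tracked_lock(struct tracked_mutex *m)
{
    pthread_mutex_lock(&m->mutex);
    /* The bookkeeping fields are only written while the mutex is held. */
    m->held = 1;
    m->owner = pthread_self();
    m->n_frames = backtrace(m->frames, MAX_FRAMES);
}

void tracked_unlock(struct tracked_mutex *m)
{
    m->held = 0;
    m->n_frames = 0;
    pthread_mutex_unlock(&m->mutex);
}

/* Meant to be called from a debugger on a stopped process, so it
   reads the fields without taking the mutex. */
void tracked_print(struct tracked_mutex *m)
{
    if (!m->held)
        return;
    fprintf(stderr, "%s (%p) in:\n", m->name, (void *)&m->mutex);
    backtrace_symbols_fd(m->frames, m->n_frames, 2);  /* fd 2 is stderr */
}
```

With the process stopped in gdb, something like `call tracked_print(&some_mutex)` would print the holder's call chain in the same "library(function+offset) [address]" form seen above. Function names only show up for symbols the dynamic linker can see, which may explain the bare addresses in some of the frames.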



Here's the deadlock:
(gdb) thread 7
[Switching to thread 7 (Thread 0xf4e7fb90 (LWP 10380))]#0  0xffffe425 in __kernel_vsyscall ()
(gdb) bt
#0  0xffffe425 in __kernel_vsyscall ()
#1  0xf7e3e589 in __lll_lock_wait () from /lib/tls/i686/cmov/libpthread.so.0
#2  0xf7e39ba6 in _L_lock_95 () from /lib/tls/i686/cmov/libpthread.so.0
#3  0xf7e3958a in pthread_mutex_lock () from /lib/tls/i686/cmov/libpthread.so.0
#4  0xf77fd32f in scm_i_pthread_mutex_lock_dbg (mtx=0xf7812780,
    lockstr=0xf7805b2e "mutex") at debug-locks.c:38
#5  0xf77e3a06 in scm_pthread_mutex_lock (mutex=0xf7812780) at threads.c:1477
#6  0xf778fa37 in increase_mtrigger (size=<value optimized out>,
    what=0xf7805837 "struct") at gc-malloc.c:234
#7  0xf778fbca in scm_gc_register_collectable_memory (mem=0x8be4000, size=39,
    what=0xf7805837 "struct") at gc-malloc.c:288
#8  0xf7790010 in scm_gc_malloc (size=39, what=0xf7805837 "struct")
    at gc-malloc.c:321
#9  0xf77e1109 in scm_alloc_struct (n_words=4, n_extra=4,
    what=0xf7805837 "struct") at struct.c:298
#10 0xf77e1e80 in scm_make_struct (vtable=0x8c0a3b0, tail_array_size=0x2,
    init=0x404) at struct.c:463
#11 0xf77c6b11 in scm_make_stack (obj=0x104, args=0x404) at stacks.c:464
#12 0xf7879696 in opencog::SchemeEval::preunwind_handler (this=0x8c2ca80,



(gdb) thread 5
[Switching to thread 5 (Thread 0xf5ec4b90 (LWP 10374))]#0  0xffffe425 in __kernel_vsyscall ()
(gdb) bt
#0  0xffffe425 in __kernel_vsyscall ()
#1  0xf7e3e589 in __lll_lock_wait () from /lib/tls/i686/cmov/libpthread.so.0
#2  0xf7e39bb4 in _L_lock_236 () from /lib/tls/i686/cmov/libpthread.so.0
#3  0xf7e3960b in pthread_mutex_lock () from /lib/tls/i686/cmov/libpthread.so.0
#4  0xf77fd32f in scm_i_pthread_mutex_lock_dbg (mtx=0xf781c808,
    lockstr=0xf77ff15c "&scm_i_critical_section_mutex") at debug-locks.c:38
#5  0xf77e6304 in scm_c_catch (tag=0x104, body=0xf77e03f0 <scm_c_eval_string>,
    body_data=0x8c69244,


(gdb) thread 12
[Switching to thread 12 (Thread 0xf24fdb90 (LWP 10395))]#0  0xffffe425 in __kernel_vsyscall ()
(gdb) bt
#0  0xffffe425 in __kernel_vsyscall ()
#1  0xf7e3e589 in __lll_lock_wait () from /lib/tls/i686/cmov/libpthread.so.0
#2  0xf7e39ba6 in _L_lock_95 () from /lib/tls/i686/cmov/libpthread.so.0
#3  0xf7e3958a in pthread_mutex_lock () from /lib/tls/i686/cmov/libpthread.so.0
#4  0xf77fd32f in scm_i_pthread_mutex_lock_dbg (mtx=0x8c676bc,
    lockstr=0xf780ce25 "&t->heap_mutex") at debug-locks.c:38
#5  0xf77e395f in scm_i_thread_put_to_sleep () at threads.c:1621
#6  0xf778db29 in scm_i_gc (what=0xf780a581 "string") at gc.c:552
#7  0xf778fa5c in increase_mtrigger (size=<value optimized out>,
    what=0xf780a581 "string") at gc-malloc.c:238
#8  0xf778fbca in scm_gc_register_collectable_memory (mem=0xf351cd60, size=2202,
    what=0xf780a581 "string") at gc-malloc.c:288
#9  0xf7790010 in scm_gc_malloc (size=2202, what=0xf780a581 "string")
    at gc-malloc.c:321
#10 0xf77c9108 in make_stringbuf (len=2201) at strings.c:118
#11 0xf77c93b5 in scm_i_make_string (len=2201, charsp=0xf24fc688)
    at strings.c:185
#12 0xf77c96f1 in scm_from_locale_stringn (
    str=0xf35053bc "(define (



