Re: [Linphone-users] Linphone deadlock

From: Simon MORLAT
Subject: Re: [Linphone-users] Linphone deadlock
Date: Mon, 05 Apr 2004 14:16:10 +0200
User-agent: Mozilla Thunderbird 0.5 (X11/20040306)

Hello Matt,

This deadlock is really a nightmare!
Hopefully the next release of linphone will use eXosip, whose single-threaded event API should simplify this complex threaded design. For now, maybe the smutex_lock(m->mutex) in osipua_distribute_event() is not necessary. It has been years since I last went through this code, but I can't see why the manager mutex is needed in this function.
Try that and let me know how it goes!
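
A minimal sketch of a variant of that idea, not the actual linphone or osipua code: rather than dropping the manager mutex entirely, the dispatch function holds it only while touching shared state and releases it before the callback runs. All names here (manager_lock, core_lock, distribute_event, on_bye, terminate_dialog, run_both_paths) are hypothetical stand-ins, and whether osipua_distribute_event() can really let go of the mutex that early is an assumption that would need checking against the source.

```c
#include <pthread.h>

/* Hypothetical stand-ins for the manager mutex and the core lock. */
static pthread_mutex_t manager_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t core_lock = PTHREAD_MUTEX_INITIALIZER;
static int pending_events = 0;  /* guarded by manager_lock */
static int calls_ended = 0;     /* guarded by core_lock */

/* Plays the role of bye_cb: it needs the core lock. */
static void on_bye(void) {
    pthread_mutex_lock(&core_lock);
    calls_ended++;
    pthread_mutex_unlock(&core_lock);
}

/* Plays the role of osipua_distribute_event: the manager lock covers only
 * the shared state and is released before the callback is invoked. */
static void distribute_event(void) {
    pthread_mutex_lock(&manager_lock);
    pending_events--;
    pthread_mutex_unlock(&manager_lock);
    on_bye();  /* no manager lock held here, so no nested acquisition */
}

/* Plays the role of the hangup path: core lock first, then manager lock. */
static void terminate_dialog(void) {
    pthread_mutex_lock(&core_lock);
    pthread_mutex_lock(&manager_lock);
    pending_events++;
    pthread_mutex_unlock(&manager_lock);
    calls_ended++;
    pthread_mutex_unlock(&core_lock);
}

static void *receive_loop(void *arg) {
    int n = *(int *)arg;
    for (int i = 0; i < n; i++)
        distribute_event();
    return NULL;
}

static void *hangup_loop(void *arg) {
    int n = *(int *)arg;
    for (int i = 0; i < n; i++)
        terminate_dialog();
    return NULL;
}

/* Runs both paths concurrently and returns the number of completed
 * operations, or -1 if a thread could not be started. */
int run_both_paths(int iterations) {
    pthread_t receiver, sender;
    calls_ended = 0;
    pending_events = 0;
    if (pthread_create(&receiver, NULL, receive_loop, &iterations) != 0)
        return -1;
    if (pthread_create(&sender, NULL, hangup_loop, &iterations) != 0) {
        pthread_join(receiver, NULL);
        return -1;
    }
    pthread_join(receiver, NULL);
    pthread_join(sender, NULL);
    return calls_ended;
}
```

Because the receiving path never holds both locks at once, the two threads cannot end up waiting on each other, even though the hangup path still nests the manager lock inside the core lock.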
Good luck

Matt Lawson wrote:

Hello Simon (and others!),

I haven't written in a while for the simple fact that Linphone has been working flawlessly. I haven't had to recompile it in 4 months. Until now of course. I've run into a deadlock.

The version I have is a little bit old, so forgive me if this has been fixed in a newer version. I compared it to 0.12.2 and the affected portions seem to be the same. Also, I am using a custom program which functions exactly like linphonec to issue the hangup commands (that calls linphone_core_terminate_dialog) but is not actually linphonec.

Here's the stack trace of the two deadlocked threads:

(gdb) bt
#0  0x40022364 in __pthread_sigsuspend () from /lib/
#1  0xbf7ff834 in ?? ()
#2  0x40022128 in __pthread_wait_for_restart_signal () from /lib/
#3  0x40023b99 in __pthread_alt_lock () from /lib/
#4  0x40020957 in pthread_mutex_lock () from /lib/
#5  0x4007223a in bye_cb (dialog=0x80bed08, trn=0x80e0338, msg=0xfffffffc, p2=0x0) at osipuacb.c:103
#6  0x4077c5cf in nist_bye_received (trn=0x80e0338, sipmsg=0x80dfd70) at nist_callbacks.c:78
#7  0x40ae2986 in nist_rcv_request () from /usr/lib/
#8  0x40ae38d9 in fsm_callmethod () from /usr/lib/
#9  0x40ae55cb in transaction_execute () from /usr/lib/
#10 0x40780c7f in osipua_distribute_event (m=0x809c5b0, ev=0x80dfe50) at udp.c:95
#11 0x40780f89 in sipd_thread (managerp=0x809c5b0) at udp.c:216
#12 0x4001fc00 in pthread_start_thread () from /lib/
#13 0x4001fc7f in pthread_start_thread_event () from /lib/
(gdb) thread 6
[Switching to thread 6 (Thread 65541 (LWP 1370))]
#0  0x40022364 in __pthread_sigsuspend () from /lib/
(gdb) bt
#0  0x40022364 in __pthread_sigsuspend () from /lib/
#1  0xbf1ff7a4 in ?? ()
#2  0x40022128 in __pthread_wait_for_restart_signal () from /lib/
#3  0x40023b99 in __pthread_alt_lock () from /lib/
#4  0x40020957 in pthread_mutex_lock () from /lib/
#5  0x40ae9392 in smutex_lock () from /usr/lib/
#6  0x40781fce in ua_transaction_execute (trn=0x809dea0, ev=0x80d9a78) at uatransaction.c:325
#7  0x4077fd66 in osip_dialog_send_request (call_leg=0x80c1648, sipmsg=0x80c7f98) at osipdialog.c:1830
#8  0x4077eb3a in osip_dialog_bye (call_leg=0x80c1648) at osipdialog.c:951
#9  0x40071249 in linphone_core_terminate_dialog (lc=0x8097fac, url=0x0) at linphonecore.c:1062

Here's an explanation of what happens:

The top thread above does this:
1. osipua_distribute_event acquires the manager mutex with smutex_lock(m->mutex).
2. The logic eventually finds its way to bye_cb, which blocks forever on linphone_core_lock(lc) because the other thread already holds that lock.

The lower thread above does this:
1. linphone_core_terminate_dialog acquires the core lock with linphone_core_lock(lc).
2. The logic finds its way to ua_transaction_execute, which blocks forever on smutex_lock(manager->mutex) because the top thread already holds that mutex.

You may think it very unlikely that this could happen, but it does, and here's how:

We have multiple Linphones connected to the same Asterisk server. We can initiate a call between two of the Linphones. We also have a custom function that does a "hangup" on all Linphones at one time. So this Linphone is trying to hang up at the exact moment it is receiving a hangup message from the other one. It is surprisingly reproducible: I can reproduce it roughly 75% of the time.

I'm just not sure what the best way to fix this is.  Ideas?

- Matt
