bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#21965: 24.5; Emacs freezes when canceling at open file


From: Maneesh Yadav
Subject: bug#21965: 24.5; Emacs freezes when canceling at open file
Date: Thu, 18 Feb 2016 13:46:13 -0800

Apologies to all I haven't ben able to follow up on this more
thoroughly, part of the problem was trying to get the crash to
replicate.  I could do it in a few minutes while I was originally
posting, then as I was getting all the right debug statements in it
got harder and harder.  I decided to just revert to normal use and
wait for it to happen.  Just happened again and I've put all the
debugging info I can here and will try to trace through glib and
figure out what is going on, just putting everything here for
reference.

overrided the following macros in gmain.c (and had to add some curly braces):

#define LOCK_CONTEXT(context) g_mutex_lock (&context->mutex)
#define LOCK_CONTEXT(context) {fprintf(stderr, "MANEESH GLIB DEBUG:
About to LOCK: %p, %s, %d, %s, %p\n", context, __FILE__, __LINE__,
__FUNCTION__,g_thread_self()); g_mutex_lock (&context->mutex);}

#define UNLOCK_CONTEXT(context) g_mutex_unlock (&context->mutex)
#define UNLOCK_CONTEXT(context) {fprintf(stderr, "MANEESH GLIB DEBUG:
About to UNLOCK: %p, %s, %d, %s, %p\n", context, __FILE__, __LINE__,
__FUNCTION__, g_thread_self()); g_mutex_unlock (&context->mutex);}


At the time of crash, an abbreviated summary of stderr:
...
MANEESH GLIB DEBUG: About to UNLOCK: 0x100f00e20, gmain.c, 4128,
g_main_context_poll, 0x102001c00

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to UNLOCK: 0x100d8a3a0, gmain.c, 3222,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3538,
g_main_context_query, 0x100da5800

MANEESH GLIB DEBUG: About to UNLOCK: 0x100d8a3a0, gmain.c, 3583,
g_main_context_query, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3859,
g_main_context_pending, 0x100da5800

MANEESH GLIB DEBUG: About to UNLOCK: 0x100d8a3a0, gmain.c, 3782,
g_main_context_iterate, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to UNLOCK: 0x100d8a3a0, gmain.c, 3222,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3801,
g_main_context_iterate, 0x100da5800

MANEESH GLIB DEBUG: About to UNLOCK: 0x100d8a3a0, gmain.c, 3812,
g_main_context_iterate, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3376,
g_main_context_prepare, 0x100da5800

MANEESH GLIB DEBUG: About to UNLOCK: 0x100d8a3a0, gmain.c, 3501,
g_main_context_prepare, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3538,
g_main_context_query, 0x100da5800

MANEESH GLIB DEBUG: About to UNLOCK: 0x100d8a3a0, gmain.c, 3583,
g_main_context_query, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 4124,
g_main_context_poll, 0x100da5800

MANEESH GLIB DEBUG: About to UNLOCK: 0x100d8a3a0, gmain.c, 4128,
g_main_context_poll, 0x100da5800

...

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3242,
g_main_context_release, 0x100da5800

MANEESH GLIB DEBUG: About to UNLOCK: 0x100d8a3a0, gmain.c, 3265,
g_main_context_release, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800



lldb traces:

(lldb) attach emacs

Process 79773 stopped

* thread #1: tid = 0x6c3b4f, 0x00007fff8deb9166
libsystem_kernel.dylib`__psynch_mutexwait + 10, queue =
'com.apple.main-thread', stop reason = signal SIGSTOP

    frame #0: 0x00007fff8deb9166 libsystem_kernel.dylib`__psynch_mutexwait + 10

libsystem_kernel.dylib`__psynch_mutexwait:

->  0x7fff8deb9166 <+10>: jae    0x7fff8deb9170            ; <+20>

    0x7fff8deb9168 <+12>: movq   %rax, %rdi

    0x7fff8deb916b <+15>: jmp    0x7fff8deb4c53            ; cerror_nocancel

    0x7fff8deb9170 <+20>: retq


Executable module set to "/opt/local/bin/emacs".

Architecture set to: x86_64h-apple-macosx.

(lldb) thread backtrace all

* thread #1: tid = 0x6c3b4f, 0x00007fff8deb9166
libsystem_kernel.dylib`__psynch_mutexwait + 10, queue =
'com.apple.main-thread', stop reason = signal SIGSTOP

  * frame #0: 0x00007fff8deb9166 libsystem_kernel.dylib`__psynch_mutexwait + 10

    frame #1: 0x00007fff88a0d696
libsystem_pthread.dylib`_pthread_mutex_lock + 480

    frame #2: 0x0000000100a17b48 libglib-2.0.0.dylib`g_mutex_lock + 26

    frame #3: 0x00000001009d9b53
libglib-2.0.0.dylib`g_main_context_acquire + 109

    frame #4: 0x000000010024fc47 emacs`xg_select + 231

    frame #5: 0x0000000100225c3d emacs`wait_reading_process_output + 3757

    frame #6: 0x0000000100008cb6 emacs`sit_for + 582

    frame #7: 0x0000000100108f00 emacs`read_char + 4496

    frame #8: 0x0000000100104edd emacs`read_key_sequence + 1757

    frame #9: 0x0000000100103cec emacs`command_loop_1 + 1212

    frame #10: 0x00000001001bf04e emacs`internal_condition_case + 382

    frame #11: 0x000000010011ce09 emacs`command_loop_2 + 41

    frame #12: 0x00000001001be696 emacs`internal_catch + 342

    frame #13: 0x0000000100102ddb emacs`command_loop + 187

    frame #14: 0x0000000100102c9f emacs`recursive_edit_1 + 127

    frame #15: 0x0000000100102f87 emacs`Frecursive_edit + 327

    frame #16: 0x0000000100100fd3 emacs`main + 4387

    frame #17: 0x00007fff8a6d25c9 libdyld.dylib`start + 1

    frame #18: 0x00007fff8a6d25c9 libdyld.dylib`start + 1

 thread #2: tid = 0x6c3b6b, 0x00007fff8deb93fa
libsystem_kernel.dylib`__select + 10, name = 'gmain'

    frame #0: 0x00007fff8deb93fa libsystem_kernel.dylib`__select + 10

    frame #1: 0x00000001009e8bbd libglib-2.0.0.dylib`g_poll + 399

    frame #2: 0x00000001009dd07c
libglib-2.0.0.dylib`g_main_context_iterate + 845

    frame #3: 0x00000001009dd1b1
libglib-2.0.0.dylib`g_main_context_iteration + 127

    frame #4: 0x00000001009de796 libglib-2.0.0.dylib`glib_worker_main + 53

    frame #5: 0x00000001009fddd9 libglib-2.0.0.dylib`g_thread_proxy + 90

    frame #6: 0x00007fff88a1005a libsystem_pthread.dylib`_pthread_body + 131

    frame #7: 0x00007fff88a0ffd7 libsystem_pthread.dylib`_pthread_start + 176

    frame #8: 0x00007fff88a0d3ed libsystem_pthread.dylib`thread_start + 13


Inkscape is the only other binary linked to glib that is running, I think:

Maneeshs-MacBook-Air:~ maneeshyadav$ ps

  PID TTY           TIME CMD

49116 ttys000    0:00.14 -bash

79773 ttys000    2:16.74 emacs

49772 ttys001    0:01.80 -bash

63245 ttys002    0:00.56 -bash

65082 ttys002    0:00.01 /bin/bash
/Applications/ChemAxon/MarvinBeans/bin/msketch

65087 ttys002   18:31.79
/Library/Java/JavaVirtualMachines/jdk1.8.0_60.jdk/Contents/Home/bin/java
chemaxon.marvin.Sketch

81099 ttys002    1:06.60 inkscape

On Wed, Nov 25, 2015 at 10:49 AM, Maneesh Yadav <maneeshkyadav@gmail.com> wrote:
> Weird.
>
> I patched glib2 this way (just overriding the macros (and removing
> semicolons on macro invocations...seemed to be the best way to deal
> with if statements that didn't wrap in curly braces...just realized my
> strings don't reflect "UN"LOCK/LOCK....not a big deal since line
> numbers are there...fixed for next time):
>
> #define LOCK_CONTEXT(context) g_mutex_lock (&context->mutex)
>
> #define LOCK_CONTEXT(context) {printf("MANEESH GLIB DEBUG: About to
> LOCK: %s, %d, %s\n", __FILE__, __LINE__, __FUNCTION__); g_mutex_lock
> (&context->mutex);}
>
> #define UNLOCK_CONTEXT(context) g_mutex_unlock (&context->mutex)
>
> #define UNLOCK_CONTEXT(context) {printf("MANEESH GLIB DEBUG: About to
> LOCK: %s, %d, %s\n", __FILE__, __LINE__, __FUNCTION__); g_mutex_unlock
> (&context->mutex);}
>
>
> Grabbing the output (emacs -Q > test.out) shows the stall mid print:
>
> MANEESH GLIB DEBUG: About to LOCK: gmain.c, 3208, g_main_context_acquire
>
> MANEESH GLIB DEBUG: About to LOCK: gmain.c, 3222, g_main_context_acquire
>
> MANEESH GLIB DEBUG: About to LOCK: gmain.c, 3801, g_main_context_iterate
>
> MANEESH GLIB DEBUG: About to LOCK: gmain.c, 3812, g_main_context_iterate
>
> MANEESH GLIB DEBUG: About to LOCK: gmain.c, 3376, g_main_context_prepare
>
> MANEESH GLIB DEBUG: About to LOCK: gmain.c, 3501, g_main_context_prepare
>
> MANEESH GLIB DEBUG
>
>
>
> gmain.c 3501 region:
>
> if (timeout)
>
>     {
>
>       *timeout = context->timeout;
>
>       if (*timeout != 0)
>
>         context->time_is_fresh = FALSE;
>
>     }
>
>
>
>   UNLOCK_CONTEXT (context)
>
>
>   return n_poll;
>
>
> Nothing terribly different from the lldb backtrace (for completeness):
>
>
> (lldb) thread backtrace all
>
> * thread #1: tid = 0x9bb3c, 0x00007fff8a861166
> libsystem_kernel.dylib`__psynch_mutexwait + 10, queue =
> 'com.apple.main-thread', stop reason = signal SIGSTOP
>
>   * frame #0: 0x00007fff8a861166 libsystem_kernel.dylib`__psynch_mutexwait + 
> 10
>
>     frame #1: 0x00007fff853b5696
> libsystem_pthread.dylib`_pthread_mutex_lock + 480
>
>     frame #2: 0x0000000100a17b78 libglib-2.0.0.dylib`g_mutex_lock + 26
>
>     frame #3: 0x00000001009da551 libglib-2.0.0.dylib`g_main_context_acquire + 
> 78
>
>     frame #4: 0x000000010024fc47 emacs`xg_select + 231
>
>     frame #5: 0x0000000100225c3d emacs`wait_reading_process_output + 3757
>
>     frame #6: 0x0000000100008cb6 emacs`sit_for + 582
>
>     frame #7: 0x0000000100108f00 emacs`read_char + 4496
>
>     frame #8: 0x0000000100104edd emacs`read_key_sequence + 1757
>
>     frame #9: 0x0000000100103cec emacs`command_loop_1 + 1212
>
>     frame #10: 0x00000001001bf04e emacs`internal_condition_case + 382
>
>     frame #11: 0x000000010011ce09 emacs`command_loop_2 + 41
>
>     frame #12: 0x00000001001be696 emacs`internal_catch + 342
>
>     frame #13: 0x0000000100102ddb emacs`command_loop + 187
>
>     frame #14: 0x0000000100102c9f emacs`recursive_edit_1 + 127
>
>     frame #15: 0x0000000100102f87 emacs`Frecursive_edit + 327
>
>     frame #16: 0x0000000100100fd3 emacs`main + 4387
>
>     frame #17: 0x00007fff8707a5c9 libdyld.dylib`start + 1
>
>     frame #18: 0x00007fff8707a5c9 libdyld.dylib`start + 1
>
>
>   thread #2: tid = 0x9bb48, 0x00007fff8a8613fa
> libsystem_kernel.dylib`__select + 10, name = 'gmain'
>
>     frame #0: 0x00007fff8a8613fa libsystem_kernel.dylib`__select + 10
>
>     frame #1: 0x00000001009e8bed libglib-2.0.0.dylib`g_poll + 399
>
>     frame #2: 0x00000001009dd303
> libglib-2.0.0.dylib`g_main_context_iterate + 627
>
>     frame #3: 0x00000001009dd40e
> libglib-2.0.0.dylib`g_main_context_iteration + 104
>
>     frame #4: 0x00000001009de7c6 libglib-2.0.0.dylib`glib_worker_main + 53
>
>     frame #5: 0x00000001009fde09 libglib-2.0.0.dylib`g_thread_proxy + 90
>
>     frame #6: 0x00007fff853b805a libsystem_pthread.dylib`_pthread_body + 131
>
>     frame #7: 0x00007fff853b7fd7 libsystem_pthread.dylib`_pthread_start + 176
>
>     frame #8: 0x00007fff853b53ed libsystem_pthread.dylib`thread_start + 13
>
>
> Everything is terrible.
>
> On Tue, Nov 24, 2015 at 5:50 PM, John Wiegley <jwiegley@gmail.com> wrote:
>>>>>>> Maneesh Yadav <maneeshkyadav@gmail.com> writes:
>>
>>>> To go deeper, we may need to build a separate copy of glib and start
>>>> putting some print statements in to find out why there is lock contention.
>>>> Would you be up for that? I'd like to know if this is happening in
>>>> g_get_worker_context.
>>
>> I've read further, and since "static gsize initialised;" must initialize to
>> zero, it's for me to see how this code could be wrong just from reading it.
>>
>> I'd like to find every line of code in glib that calls LOCK_CONTEXT or
>> UNLOCK_CONTEXT, and print out:
>>
>>     Function, file, line, lock or unlock, pointer value of context
>>
>> That should help us narrow it down.
>>
>> John





reply via email to

[Prev in Thread] Current Thread [Next in Thread]