bug-guix

bug#31925: 'guix substitutes' sometimes hangs on glibc 2.27


From: Andy Wingo
Subject: bug#31925: 'guix substitutes' sometimes hangs on glibc 2.27
Date: Thu, 05 Jul 2018 10:00:52 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.3 (gnu/linux)

Hi!

On Thu 05 Jul 2018 05:33, Mark H Weaver <address@hidden> writes:

>> One problem I’ve noticed is that the child process that
>> ‘call-with-decompressed-port’ spawns would be stuck trying to get the
>> allocation lock:
>>
>> So it seems quite clear that the thing has the alloc lock taken.  I
>> suppose this can happen if one of the libgc threads runs right when we
>> call fork and takes the alloc lock, right?
>
> Does libgc spawn threads that run concurrently with user threads?  If
> so, that would be news to me.  My understanding was that incremental
> marking occurs within GC allocation calls, and marking threads are only
> spawned after all user threads have been stopped, but I could be wrong.

I think Mark is correct.

> The first idea that comes to my mind is that perhaps the finalization
> thread is holding the GC allocation lock when 'fork' is called.

I think we all agree that you're only supposed to "fork" when no other
threads are running.

As far as the finalizer thread goes, "primitive-fork" calls
"scm_i_finalizer_pre_fork", which should join the finalizer thread
before the fork.  There could be a bug, obviously, but the intention is
for Guile to shut down its internal threads.  Here's the body of
primitive-fork, fwiw:

    {
      int pid;
      scm_i_finalizer_pre_fork ();
      if (scm_ilength (scm_all_threads ()) != 1)
        /* Other threads may be holding on to resources that Guile needs --
           it is not safe to permit one thread to fork while others are
           running.
    
           In addition, POSIX clearly specifies that if a multi-threaded
           program forks, the child must only call functions that are
           async-signal-safe.  We can't guarantee that in general.  The best
           we can do is to allow forking only very early, before any call to
           sigaction spawns the signal-handling thread.  */
        scm_display
          (scm_from_latin1_string
           ("warning: call to primitive-fork while multiple threads are running;\n"
            "         further behavior unspecified.  See \"Processes\" in the\n"
            "         manual, for more information.\n"),
           scm_current_warning_port ());
      pid = fork ();
      if (pid == -1)
        SCM_SYSERROR;
      return scm_from_int (pid);
    }
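
For illustration only, here is a minimal standalone sketch of the same
"check the thread count, then fork" idea outside of Guile.  It is not
Guile's code: "count_threads" and "fork_if_single_threaded" are
hypothetical names, the count comes from listing /proc/self/task (so it
is Linux-only), and unlike primitive-fork, which only warns, this
variant refuses to fork.  Note also that /proc/self/task counts every
kernel thread in the process, whereas scm_all_threads only lists
threads known to Guile.

```c
#include <assert.h>     /* assert, for callers that check the result */
#include <dirent.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: count the threads of the calling process by
   listing /proc/self/task (Linux-specific assumption).
   Returns -1 if the directory cannot be opened.  */
int
count_threads (void)
{
  DIR *dir = opendir ("/proc/self/task");
  struct dirent *entry;
  int count = 0;

  if (dir == NULL)
    return -1;
  while ((entry = readdir (dir)) != NULL)
    if (entry->d_name[0] != '.')        /* skip "." and ".." */
      count++;
  closedir (dir);
  return count;
}

/* Hypothetical helper: fork only when exactly one thread is running.
   Otherwise fail with EBUSY instead of forking, since other threads
   may hold locks that the child could never release.  */
pid_t
fork_if_single_threaded (void)
{
  if (count_threads () != 1)
    {
      errno = EBUSY;
      return -1;
    }
  return fork ();
}
```

A check like this is inherently racy if some other thread can spawn
threads concurrently; it is only meant to show the shape of the guard.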

> Another possibility: both the finalization thread and the signal
> delivery thread call 'scm_without_guile', which calls 'GC_do_blocking',
> which also temporarily grabs the GC allocation lock before calling the
> specified function.  See 'GC_do_blocking_inner' in pthread_support.c in
> libgc.  You spawn the signal delivery thread by calling 'sigaction' and
> you make work for it to do every second when the SIGALRM is delivered.

The signal thread is a possibility though in that case you'd get a
warning; the signal-handling thread appears in scm_all_threads.  Do you
see a warning?  If you do, that is a problem :)

>> If that is correct, the fix would be to call fork within
>> ‘GC_call_with_alloc_lock’.
>>
>> How does that sound?
>
> Sure, sounds good to me.

I don't think this is necessary.  I think the problem is that other
threads are running.  If we solve that, then we solve this issue; if we
don't solve that, we don't know what else those threads are doing, so we
don't know what mutexes and other state they might have.
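
To make the failure mode concrete, here is a small sketch, independent
of Guile and libgc, of what happens to a lock that another thread holds
at the moment of fork.  All names ("demo_lock", "holder",
"child_inherits_held_lock") are made up for the example.  It assumes a
default pthread mutex on a glibc-like system; strictly speaking, POSIX
does not make pthread_mutex_trylock async-signal-safe, so calling it in
the child of a multi-threaded process is outside what the standard
guarantees.

```c
#include <assert.h>     /* assert, for callers that check the result */
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t lock_taken, may_release;

/* Helper thread: take the lock, tell the main thread, and keep holding
   the lock until the main thread allows us to release it.  */
static void *
holder (void *arg)
{
  (void) arg;
  pthread_mutex_lock (&demo_lock);
  sem_post (&lock_taken);
  sem_wait (&may_release);
  pthread_mutex_unlock (&demo_lock);
  return NULL;
}

/* Fork while another thread holds demo_lock.  Returns 1 if the child
   found the lock already held, 0 if the child could take it, and -1 on
   error.  */
int
child_inherits_held_lock (void)
{
  pthread_t tid;
  pid_t pid;
  int status;

  if (sem_init (&lock_taken, 0, 0) != 0 || sem_init (&may_release, 0, 0) != 0)
    return -1;
  if (pthread_create (&tid, NULL, holder, NULL) != 0)
    return -1;
  sem_wait (&lock_taken);       /* the holder thread now owns demo_lock */

  pid = fork ();
  if (pid == 0)
    /* Child: only the forking thread exists here, yet the copied mutex
       is still marked as locked, and nobody is left to unlock it.  */
    _exit (pthread_mutex_trylock (&demo_lock) == EBUSY ? 1 : 0);

  sem_post (&may_release);      /* let the holder finish in the parent */
  pthread_join (tid, NULL);
  if (pid == -1 || waitpid (pid, &status, 0) != pid || !WIFEXITED (status))
    return -1;
  return WEXITSTATUS (status);
}
```

With a blocking pthread_mutex_lock in place of the trylock, the child
would simply hang, which is the kind of hang you get whenever some
other thread holds a lock across the fork.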

Andy




