bug-guix

bug#63982: Shepherd can crash when a user service fails to start


From: Ludovic Courtès
Subject: bug#63982: Shepherd can crash when a user service fails to start
Date: Wed, 12 Jul 2023 19:46:56 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

Hi!

Ludovic Courtès <ludo@gnu.org> skribis:

> Turns out that this happens when calling the ‘daemonize’ action on
> ‘root’.  I have a reproducer now and am investigating…

Good news: this is fixed in Shepherd commit
f4272d2f0f393d2aa3e9d76b36ab6aa5f2fc72c2!

The root cause is inconsistent semantics when mixing epoll, signalfd,
and fork, specifically this part from signalfd(2):

   epoll(7) semantics
       If  a  process adds (via epoll_ctl(2)) a signalfd file descriptor to an
       epoll(7) instance, then epoll_wait(2) returns events only  for  signals
       sent  to that process.  In particular, if the process then uses fork(2)
       to create a child process, then the child will be able to read(2)  sig‐
       nals  that  are  sent  to  it  using  the signalfd file descriptor, but
       epoll_wait(2) will not indicate that the signalfd  file  descriptor  is
       ready.   In  this  scenario,  a  possible  workaround is that after the
       fork(2), the child process can close the signalfd file descriptor  that
       it  inherited  from the parent process and then create another signalfd
       file descriptor and add it to the epoll instance. […]

The C program below illustrates this behavior:

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/epoll.h>

int
main ()
{
  int ep, sfd;

  sigset_t signals;
  sigemptyset (&signals);
  sigaddset (&signals, SIGINT);
  sigaddset (&signals, SIGHUP);

  sigprocmask (SIG_BLOCK, &signals, NULL);
  sfd = signalfd (-1, &signals, SFD_CLOEXEC);

  ep = epoll_create1 (EPOLL_CLOEXEC);

  struct epoll_event events =
    { .events = EPOLLIN | EPOLLONESHOT, .data = NULL };
  epoll_ctl (ep, EPOLL_CTL_ADD, sfd, &events);

  epoll_wait (ep, &events, 1, 123);

  if (fork () == 0)
    {
      /* Quoth signalfd(2):

         If  a  process adds (via epoll_ctl(2)) a signalfd file descriptor to an
         epoll(7) instance, then epoll_wait(2) returns events only  for  signals
         sent  to that process.  In particular, if the process then uses fork(2)
         to create a child process, then the child will be able to read(2)  sig‐
         nals  that  are  sent  to  it  using  the signalfd file descriptor, but
         epoll_wait(2) will not indicate that the signalfd  file  descriptor  is
         ready.   */

      printf ("try this: kill -INT %i\n", getpid ());
      while (1)
        {
          struct signalfd_siginfo info;
          if (epoll_wait (ep, &events, 1, 777) > 0)
            {
              read (sfd, &info, sizeof info);
              printf ("got signal %i!\n", info.ssi_signo);
              epoll_ctl (ep, EPOLL_CTL_MOD, sfd, &events);
            }
        }
    }

  return 0;
}

Of course it took me a while to find out about this; I first looked at
things individually and didn’t expect the mixture to behave
inconsistently.

Maxim, let me know if it works for you!

Thanks,
Ludo’.
