[bug#30637] [WIP] shepherd: Poll every 0.5s to find dead forked services
From: Carlo Zancanaro
Subject: [bug#30637] [WIP] shepherd: Poll every 0.5s to find dead forked services
Date: Fri, 02 Mar 2018 21:13:53 +1100
User-agent: mu4e 1.0; emacs 25.3.1

Hey Ludo,
On Fri, Mar 02 2018, Ludovic Courtès wrote:
>> I am doing that. The problem is that when a service dies
>> (crashes, quits, etc.) the `respawn?` option cannot be honoured
>> because shepherd is not notified that the process has
>> terminated (because it never receives a SIGCHLD for the forked
>> pid). My patch polls for the processes we expect, to make up
>> for the lack of notification.
>
> I see.
>
> Actually, thinking more about it, we should be using
> PR_SET_CHILD_SUBREAPER from prctl(2), which is designed exactly
> for that.

Excellent! This is exactly the information that I needed. This is
what I've been looking for, but without enough knowledge to be
able to find it. Thanks!

> So what about this plan:
>
>   1. Add FFI bindings in (shepherd system) for prctl(2). We
>      should arrange for it to throw to 'system-error when the
>      ‘prctl’ symbol is missing, as is the case on GNU/Hurd.

Are we okay with having this just not work on GNU/Hurd (or on Linux
kernels older than 3.4, according to the prctl manpage)? Could we
fall back to a polling approach if prctl isn't available? I don't
really like the idea of this working on some kernels but not
others, given that process supervision is one of the main jobs of
shepherd.
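
A minimal sketch of that fallback idea, written in Python with ctypes rather than Guile, so it is not shepherd code. The names `become_subreaper`, `pid_alive` and `choose_strategy` are hypothetical, and the constant 36 is the value of PR_SET_CHILD_SUBREAPER in <linux/prctl.h>:

```python
import ctypes
import os

PR_SET_CHILD_SUBREAPER = 36  # value from <linux/prctl.h>


def become_subreaper():
    """Return True if this process was marked as a child subreaper."""
    try:
        prctl = ctypes.CDLL(None, use_errno=True).prctl
    except (OSError, AttributeError):
        # No usable libc handle or no 'prctl' symbol (e.g. not Linux).
        return False
    # prctl takes unsigned long arguments after the option.
    args = [ctypes.c_ulong(v) for v in (1, 0, 0, 0)]
    return prctl(PR_SET_CHILD_SUBREAPER, *args) == 0


def pid_alive(pid):
    """Polling fallback: check whether a process with this PID exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence and permissions
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but is owned by another user
    return True


def choose_strategy():
    """Prefer the subreaper; otherwise poll (hypothetical policy)."""
    return "subreaper" if become_subreaper() else "polling"
```

With this shape, a supervisor would only run its polling loop over the expected PIDs when `choose_strategy()` returns "polling".
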

>   2. Use prctl/PR_SET_CHILD_SUBREAPER in ‘exec-command’. Here we
>      must ‘catch-system-error’ around that call to cater to
>      GNU/Hurd.

Why would we need to set it in exec-command? It looks like it
modifies the state of the calling process, which means we'd want
to set it in the shepherd service, not in each of the child
processes.
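
To illustrate the point about the calling process, here is a rough sketch (again Python with ctypes, not shepherd code; `reap_orphaned_daemon` is a hypothetical name). The supervisor sets the flag on itself once, a launcher double-forks a short-lived "daemon" and exits, and the orphan is then reparented to the supervisor, which can wait on it:

```python
import ctypes
import os

PR_SET_CHILD_SUBREAPER = 36  # value from <linux/prctl.h>


def reap_orphaned_daemon(exit_code=7):
    """Double-fork a short-lived 'daemon' and reap it from this process.

    Returns the daemon's exit code, or None if prctl is unavailable.
    """
    try:
        prctl = ctypes.CDLL(None, use_errno=True).prctl
    except (OSError, AttributeError):
        return None
    args = [ctypes.c_ulong(v) for v in (1, 0, 0, 0)]
    # Set on the supervisor itself, not on the children it spawns.
    if prctl(PR_SET_CHILD_SUBREAPER, *args) != 0:
        return None

    read_end, write_end = os.pipe()
    launcher = os.fork()
    if launcher == 0:
        # Launcher: fork the daemon, report its PID, and exit at once.
        os.close(read_end)
        daemon = os.fork()
        if daemon == 0:
            os.close(write_end)
            os._exit(exit_code)
        os.write(write_end, str(daemon).encode())
        os._exit(0)

    os.close(write_end)
    daemon_pid = int(os.read(read_end, 32))
    os.close(read_end)
    os.waitpid(launcher, 0)
    # The orphaned daemon is now our child, so waiting on it works.
    _, status = os.waitpid(daemon_pid, 0)
    return os.WEXITSTATUS(status)
```

Without the prctl call, the orphan would be reparented to init instead, and the final `waitpid` would fail with ChildProcessError.
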

> That would address the main issue without having to resort to
> polling. Respawning will work only when #:pid-file is used
> though, but that’s already an improvement.
>
> Thoughts?

I'll try to get this working in the next few days. Hopefully
you'll see a patch from me soon.
Carlo