Re: [sysvinit-devel] sysvinit causing many reads?

From: Jesse Smith
Subject: Re: [sysvinit-devel] sysvinit causing many reads?
Date: Wed, 10 Jul 2019 17:31:39 -0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.7.2

> It seems select can return ERESTARTNOHAND many times when a lot of
> child processes exit.
> With strace I can observe longer periods of such situations. Maybe the
> loop should relax a bit after ERESTARTNOHAND and not try again
> immediately?

This looks like an interesting and, if I'm not mistaken, unusual
situation.  It does indeed look like init is regularly being interrupted
from checking its message pipe for new instructions and is pausing to
deal with child processes exiting, then immediately checking the pipe again.

I suppose we could put in some kind of delay before checking the pipe,
but I'm not sure how effective it would be, for a few reasons. One is
that delay calls, like sleep(), are also interrupted by signals from
terminating child processes, so the delay would get cut short in much
the same way the select() call does.
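
To illustrate the point about interrupted delays, here is a rough
sketch of a sleep that resumes after each interruption. It assumes
POSIX nanosleep(), which reports the unslept time when a signal handler
cuts it short. The helper name and its return convention are
hypothetical, and this is not code from sysvinit.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical helper: sleep for the whole interval, resuming each time
 * a signal handler (for example for SIGCHLD) interrupts the sleep.
 * Returns how many times the sleep was interrupted, or -1 on any other
 * error. */
int sleep_uninterrupted(long millis)
{
    assert(millis >= 0);

    struct timespec req = {
        .tv_sec  = millis / 1000,
        .tv_nsec = (millis % 1000) * 1000000L
    };
    struct timespec rem;
    int interruptions = 0;

    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;
        interruptions++;
        req = rem;   /* continue with whatever time is still left */
    }
    return interruptions;
}
```

A plain sleep() without the retry loop would simply return early on the
first SIGCHLD, which is why a naive delay may not slow the loop down
much during a burst of exiting children.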

Another factor we might want to consider: if we do introduce a delay,
will zombie child processes pile up? It sounds like a lot of child
processes are being reaped, and holding up the reaping might not be
ideal here.
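
On the batching question, a common pattern is to collect every child
that has already exited in one pass, using waitpid() with WNOHANG, so
that a burst of exits is handled together and no zombies are left
behind. The following is only a sketch of that general pattern under
the assumption that exit statuses can be discarded; the function name
is hypothetical and this is not how sysvinit is known to do it.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: reap every child that has already terminated,
 * without blocking. Returns the number of children collected. */
int reap_all_children(void)
{
    int reaped = 0;
    int status;

    for (;;) {
        pid_t pid = waitpid(-1, &status, WNOHANG);
        if (pid > 0) {
            reaped++;          /* one zombie collected, look for more */
            continue;
        }
        if (pid == -1 && errno == EINTR)
            continue;          /* interrupted by a signal, try again */
        /* pid == 0: children exist but none has exited yet;
         * pid == -1 with ECHILD: there are no children at all. */
        break;
    }
    return reaped;
}
```

With a loop like this, a delay would trade fewer wake-ups for larger
batches per pass; whether that is a net win is the open question here.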

A side issue I think we might want to look into here is why init is
receiving so many child termination signals. Usually init only reaps
orphaned processes, that is, children whose parent has already
terminated. Ideally this is rare, which makes me wonder if something is
going wrong with a parent/child pairing elsewhere on the system that is
effectively spamming init with signals it wouldn't receive under normal
conditions.

What I'm suggesting is that another program may be misbehaving and
putting init under an unusual load of signals, which in turn causes
more reads on the pipe. This may be something we need to fix, but I
want to make sure we're addressing the problem and not just covering it
up.

Let's try to answer two key sets of questions:

1. Any idea what program is causing all the child signals coming to
init? And why aren't those children being collected by their parent?

2. Will putting in a delay upon receiving an interrupt, if we can do so,
make this situation better? Or will we just end up handling larger
batches of child terminations?

I'm open to thoughts and suggestions.

- Jesse
