[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: sort link failure due to sleep
Re: sort link failure due to sleep
Tue, 24 Nov 2009 08:26:22 +0100
Eric Blake wrote:
> Jim Meyering <jim <at> meyering.net> writes:
>> > Or should I go with an alternative patch of making sort.c use xnanosleep
>> > instead of sleep? (For that matter, if we went with xnanosleep, we could
>> > sort's child wait loop start with something smaller, like 0.25 seconds,
>> > of a minimum granularity of 1 second.)
>> Hi Eric,
>> Your alternative patch sounds better still.
> How about this? Note there is a slight change in semantics to forking done by
> sort - I chose to cut the minimum wait time by a factor of 4, but increase the
> number of retries only by 2, for a net result that we give up twice as fast.
Actually, adding 2 (as you did for MAX_FORK_TRIES_COMPRESS) happens to be
perfect, since the retry delay is doubled at each iteration. Cutting the
initial delay by 4 is matched by doubling the delay two more times.
However, for MAX_FORK_TRIES_DECOMPRESS, you should also merely add 2,
and not double it; there is a big difference between waiting 2^8 and 2^14
seconds (~4 min and 4.5 hours) for the final iteration. As I imagine
being in the unusual situation where I'd see fork failure, I think when
the delay is larger than a second or two, sort should announce on stderr
what it is doing. Otherwise, the poor user will think sort has simply hung.
Even then, a delay of four minutes seems excessive, so maybe add only 1 or
Please update the log not to say "double", too:
(MAX_FORK_TRIES_COMPRESS, MAX_FORK_TRIES_DECOMPRESS): Add 2,
to counteract the division by 4 in the length of the initial delay.
Thanks for the cfg.mk addition.
> Is this discrepancy okay, or do I still need to tweak the scale of one (or
> both) of the changes? There is also the potential of a subtle change of
> behavior of how sort reacts if invoked with SIGALRM blocked or ignored, since
> POSIX requires that nanosleep work independently of SIGALRM, while sleep may
> hampered by interference with the SIGALRM signal (although for really old
> platforms that lack nanosleep, the gnulib xnanosleep fallback calls usleep or
> even sleep, so we can't quite guarantee no SIGALRM interference).
This sounds like an acceptable risk, since it will be rare,
and even then, will affect only platforms we don't care about.
> From: Eric Blake <address@hidden>
> Date: Mon, 23 Nov 2009 14:59:18 -0700
> Subject: [PATCH] build: fix link failure on cygwin
> Cygwin 1.5 has a broken sleep, and the gnulib tests dragged in
> rpl_sleep which then caused a link failure because it wasn't in
> libcoreutils.a. We could solve it by using the gnulib sleep module.
> However, sleep and usleep may interact poorly with SIGALRM, and they
> have less granularity; so it is better to adopt a policy that if we
> must sleep, prefer xnanosleep.
> * src/sort.c (pipe_fork): Use xnanosleep, to avoid the need for
> rpl_sleep on cygwin, and to reduce granularity.
> (MAX_FORK_TRIES_COMPRESS, MAX_FORK_TRIES_DECOMPRESS): Double,
> to account for reduction in granularity.
> * src/tail.c (tail_file): Use xnanosleep in debug code.
> * cfg.mk (sc_prohibit_sleep): New rule.
> - MAX_FORK_TRIES_COMPRESS = 2,
> + MAX_FORK_TRIES_COMPRESS = 4,
> - MAX_FORK_TRIES_DECOMPRESS = 8
> + MAX_FORK_TRIES_DECOMPRESS = 16
> /* The representation of the decimal point in the current locale. */
> @@ -868,7 +869,7 @@ pipe_fork (int pipefds, size_t tries)
> #if HAVE_WORKING_FORK
> struct tempnode *saved_temphead;
> int saved_errno;
> - unsigned int wait_retry = 1;
> + double wait_retry = 0.25;