coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [coreutils] [PATCH] sort: fix some --compress reaper bugs


From: Jim Meyering
Subject: Re: [coreutils] [PATCH] sort: fix some --compress reaper bugs
Date: Tue, 14 Dec 2010 10:55:20 +0100

Paul Eggert wrote:
> I found and fixed some bugs and simplified some code
> while trying (and so far, failing) to fix the hang reported at the end of
> <http://lists.gnu.org/archive/html/bug-coreutils/2010-12/msg00023.html>.
> I installed the following, and plan to look into that hang some more.
>
> * src/sort.c (uintptr): New type.
> (enum procstate, struct procnode, update_proc): Remove.
> (proctab_hasher, proctab_comparator, register_proc, wait_proc):
> (reap_some): The proctab is now simply a hash of process-IDs
> rather than of pointers to objects with reference counts and
> states; this is smaller and faster and easier to understand.
> (nprocs): Now pid_t, not size_t, since one cannot have more than
> PID_MAX children.
> (reap): If the argument is -1, wait; if 0 (a new value), do not.
> Delete pid from proctab as needed.  Ignore children that are not
> in proctab, as they are from the program that exec'ed us and are
> irrelevant to our success or failure.
> (delete_proc, reap_all): New functions.
> (open_temp): Register the child.
> (sort): Clean up all children afterwards; without this patch,
> 'sort' sometimes missed failures in children due to race conditions.
> * tests/Makefile.am (TESTS): Add misc/sort-compress-proc.
> * tests/misc/sort-compress-proc: New file, to test for the
> bugs fixed above.

Nice clean-up and fix.  Thanks for the patch.
I'm glad to know that you're still on it.

...
> diff --git a/tests/misc/sort-compress-proc b/tests/misc/sort-compress-proc
...
> +. "${srcdir=.}/init.sh"; path_prepend_ ../src
> +print_ver_ sort
> +expensive_
> +
> +# Ensure that $TMPDIR is valid.
> +if test -d /tmp && test -w /tmp
> +then TMPDIR=/tmp
> +else TMPDIR=.
> +fi
> +export TMPDIR

You may remove the TMPDIR-manipulating code above since tests/check.mk
already ensures that TMPDIR is a directory:

  TESTS_ENVIRONMENT =                             \
    . $(srcdir)/lang-default;                     \
    tmp__=$$TMPDIR; test -d "$$tmp__" || tmp__=.; \
    . $(srcdir)/envvar-check;                     \
    TMPDIR=$$tmp__; export TMPDIR;                \

But as you can see, it does not ensure it's writable, while your
code did, so moving your conjunct to check.mk would be an improvement.

...

> +# Give compression subprocesses time to react when 'sort' exits.
> +# Otherwise, under NFS, when 'sort' unlinks the temp files and they
> +# are renamed to .nfsXXXX instead of being removed, the parent cleanup
> +# of this directory will fail because the files are still open.
> +sleep 1

It'd be a shame to make everyone sleep even one second for NFS.
And on a heavily loaded system, 1 second might not be enough.
Can you formulate that as a loop that sleeps say 0.1 seconds
at a time for up to 5 seconds and that checks whether it's
safe to exit?

> +Exit $fail



reply via email to

[Prev in Thread] Current Thread [Next in Thread]