bug-coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#12820: FWIW, this is still happening as of gnulib 4a82904


From: Nix
Subject: bug#12820: FWIW, this is still happening as of gnulib 4a82904
Date: Wed, 27 Feb 2013 18:25:44 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux)

On 26 Feb 2013, Paul Eggert outgrape:

> On 02/26/13 13:48, Nix wrote:
>
>> #2  0x00000000004021e0 in test_utimens (address@hidden, func=0x401890 
>> <do_utimensat>) at test-utimens.h:35
>
> If that line number is right, the test program
> did a creat (file, 0600) that succeeded, followed
> by a stat (file, &st) that failed.

Looks like it.

Running test-utimensat in a tight loop shows frequent assertion
failures: running it under high load (e.g. while a 'make -j 9 check' is
running) shows fewer:

No load:

% START=$(date +%s); while [[ $(($(date +%s)-$START)) -lt 10 ]]; do 
./test-utimensat; done
test-lutimens.h:162: assertion failed
test-lutimens.h:162: assertion failed
test-utimens.h:103: assertion failed
test-utimens.h:41: assertion failed
test-lutimens.h:162: assertion failed

Parallel make -j 9 check:

% START=$(date +%s); while [[ $(($(date +%s)-$START)) -lt 10 ]]; do 
./test-utimensat; done
test-utimens.h:113: assertion failed
test-utimens.h:37: assertion failed
test-utimens.h:130: assertion failed
test-utimens.h:113: assertion failed
test-utimens.h:103: assertion failed
test-lutimens.h:64: assertion failed
test-utimens.h:128: assertion failed
test-utimens.h:103: assertion failed
test-lutimens.h:64: assertion failed

I note that this is not exactly healthy hardware: it is has a slowly
failing fan and is going in for repair tomorrow. However it isn't
overheating yet and *did* just complete a GCC LTO bootstrap-and-test
with no FAILs I wasn't expecting so I don't think the fan failure is
likely to be the cause of these assertions.

As is typical of timing-related faults, running it under strace makes
the frequency of the problems plunge. I've stuck a tarball of all the
distinct failing straces I could get in a few hours of tight loops under
strace up at
<http://www.esperi.org.uk/~nix/bugs/gnulib/utimensat.tar.gz>. (I
truncated the execve lines because dumping my entire process environment
out in public is not something I'm entirely happy doing. Paranoia,
maybe. :) )


Classifying the small set of assertion failures above (a set small
enough that I felt I could attack it without dying of boredom), we see:

test-utimens.h:37: assertion failed
  ASSERT (func (BASE "file", NULL) == 0);

test-utimens.h:113: assertion failed
  ASSERT (st3.st_atime == Y2K);

test-utimens.h:128: assertion failed
  ASSERT (get_stat_mtime_ns (&st3) == get_stat_mtime_ns (&st2));

test-lutimens.h:64: assertion failed
test-lutimens.h:162: assertion failed
test-utimens.h:41: assertion failed
test-utimens.h:103: assertion failed
test-utimens.h:130: assertion failed
  ASSERT (ctime_compare (&st1, &st2) < 0);
  ASSERT (ctime_compare (&st2, &st3) < 0);

I thought it was worth my analyzing at least the largest set of
failures, this one. This is simply a race. When these assertions are
hit, I see things like this:

st1.st_ctime: 1361977729; st2.st_ctime: 1361977729; compare: 0; 
get_stat_ctime_ns(st1): 613474299, get_stat_ctime_ns(st2): 613474299

i.e. the nsec values are identical. This is surely valid: it just means
the two stats were carried out in the same 1ns interval. All these
asserts should be <= 0. (I just saw an identical failure in
test-lchown.h:112.)





reply via email to

[Prev in Thread] Current Thread [Next in Thread]