automake-patches

Re: cond5.test spurious failure


From: Peter Rosin
Subject: Re: cond5.test spurious failure
Date: Fri, 06 Aug 2010 15:36:55 +0200
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.7) Gecko/20100713 Thunderbird/3.1.1

On 2010-08-06 15:15, Stefano Lattarini wrote:
At Friday 06 August 2010, Peter Rosin wrote:
Hi Stefano,

On 2010-08-06 13:54, Stefano Lattarini wrote:
At Friday 06 August 2010, Peter Rosin wrote:
[CUT]
I was not running in parallel, and I was running off the msvc
branch (basically maint, I think) plus the commit in the previous
message, so I am up to date.
Ouch. Bad news.
I had a look at the cond5.log file
(but it is now gone) and from what I could see the
     if kill -0 $pid; then
failed on the first attempt (which I found odd)
That could be possible, it just means that the automake run completed
blazingly fast.  Still, that would have been really, really blazingly
fast... hmm...

Nope, not that fast. As it happens, kill fails even if the process
still lives. See below.

and then the following grep in the else statement must have come up
empty.
This is even more odd IMO.
But when I looked at the stderr file I think it had the expected
content so the grep should have been ok.
And this is even more odd.

The stderr content gets there eventually, but it is not there when
the grep runs. See below.


Maybe the process was gone (kill failed) but the stderr content was
not yet flushed or something?
Is that even possible?  IIUC, all the I/O of a process should be
flushed automatically if the process terminates "normally" (as in this
case)...

Probably a red herring.

This is on MSYS/MinGW so don't expect full POSIX adherence...
Oh, I really know nothing about MSYS/MinGW, so that could be a
possibility.  Any expert of MSYS/MinGW can confirm or deny?

Hmmm, maybe I should just try to run the test again? Ok, below is
the output from a failed run (it failed directly). Maybe we should
add an extra "sleep $arbitrary" before the "cat stderr"?
Do you think it would really solve the problem, or would it just
reduce the likelihood of the race condition?

Nope, see below.

=== Running test ./cond5.test
++ pwd
/home/peda/automake/git/automake/tests/cond5.dir
+ set -e
+ cat
+ cat
+ aclocal-1.11 -Werror
+ pid=7240
+ automake-1.11 --foreign -Werror -Wall
Wait wait, these last two lines seem swapped... is this a minor wart,
or the symptom of a real problem?

They are in the same order when the test succeeds.

+ try=1
+ test 1 -le 30
+ kill -0 7240
./cond5.test: line 54: kill: (7240) - No such process
+ cat stderr
And this definitely shouldn't be empty...
+ grep 'variable.*OPT_SRC.*recursively defined' stderr
+ exit_status=1
+ set +e
+ cd /home/peda/automake/git/automake/tests
+ case $exit_status,$keep_testdirs in
+ test 0 '!=' 0
+ echo 'cond5: exit 1'
cond5: exit 1
+ exit 1

To help us pinpoint the problem, could you please try to run the
attached "fake" test scripts multiple times, and post the outputs?
Please forgive me for the ugliness of such scripts, but I'm just
taking wild guesses here.

I was busy testing if the sleep worked and was just about to send a
message when I saw this, so I'm replying here instead of responding
to self. Haven't tested your scripts yet, but that might not be needed.



Nope, the sleep didn't help, I still get about 50% failure
rate with a "sleep 2" in there. It must be something else.

However, I added a "jobs" instead of the "sleep 2", and
interestingly the process was still alive even if the
"kill -0 $pid" failed, and I can also note that it takes
a while before the stderr file gets the correct content
after the failure.

Also doing a "ps" reveals that the recorded pid appears correct.
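
A cross-check along these lines can be sketched as follows. This is only
an illustration, not the actual cond5.test code: "sleep 3" is a
hypothetical stand-in for the backgrounded automake run, and "ps -p" is
assumed to be available (ps options vary between platforms). On a shell
whose kill builtin behaves, all three views should agree that the
process is alive.

```shell
#!/bin/sh
# Hypothetical stand-in for the backgrounded "$AUTOMAKE 2>stderr &" run.
sleep 3 &
pid=$!

# View 1: the probe the test relies on.
if kill -0 "$pid" 2>/dev/null; then
  kill_says=alive
else
  kill_says=gone
fi

# View 2: the process table (assumes a ps that accepts -p).
if ps -p "$pid" >/dev/null 2>&1; then
  ps_says=alive
else
  ps_says=gone
fi

# View 3: the shell's own job table, printed for visual comparison.
jobs

echo "pid $pid: kill -0 says $kill_says, ps says $ps_says"

# Clean up the stand-in process.
kill "$pid" 2>/dev/null || true
wait "$pid" 2>/dev/null || true
```

If "kill -0" reports the process as gone while the other two still show
it, the probe itself is the unreliable part.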

So, this seems like a bug in the kill builtin in MSYS bash.

Cheers,
Peter


$ bash --version
GNU bash, version 3.1.0(1)-release (i686-pc-msys)
Copyright (C) 2005 Free Software Foundation, Inc.


=== Running test ./cond5.test
++ pwd
/home/peda/automake/git/automake/tests/cond5.dir
+ set -e
+ cat
+ cat
+ aclocal-1.11 -Werror
+ pid=7576
+ automake-1.11 --foreign -Werror -Wall
+ try=1
+ test 1 -le 30
+ kill -0 7576
./cond5.test: line 54: kill: (7576) - No such process
+ jobs
[1]+  Running                 $AUTOMAKE 2>stderr &
+ cat stderr
+ grep 'variable.*OPT_SRC.*recursively defined' stderr
+ exit_status=1
+ set +e
+ cd /home/peda/automake/git/automake/tests
+ case $exit_status,$keep_testdirs in
+ test 0 '!=' 0
+ echo 'cond5: exit 1'
cond5: exit 1
+ exit 1
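
Given the trace above, one possible way to make the polling loop
tolerate a spurious "kill -0" failure is to reap the child with "wait"
before grepping its output. The following is a minimal sketch under
stated assumptions, not the actual test script and not a tested fix for
MSYS: the subshell is a hypothetical stand-in for the automake run, and
the loop shape is only inferred from the trace.

```shell
#!/bin/sh
# Hypothetical stand-in for "$AUTOMAKE 2>stderr &": writes the expected
# diagnostic to the file "stderr" after a short delay.
(sleep 1; echo 'Makefile.am: variable OPT_SRC recursively defined' >&2) 2>stderr &
pid=$!

result=timeout
try=1
while test $try -le 30; do
  if kill -0 "$pid" 2>/dev/null; then
    # Child still reported alive: keep polling.
    sleep 1
    try=$((try + 1))
  else
    # Do not trust the failed probe alone; wait for the child so that
    # its stderr file is complete before it is inspected.
    wait "$pid"
    if grep 'variable.*OPT_SRC.*recursively defined' stderr >/dev/null; then
      result=found
    else
      result=missing
    fi
    break
  fi
done
echo "result: $result"
```

The trade-off is that "wait" has no timeout: if the probe fails
spuriously while the child is genuinely hung, the script blocks instead
of giving up after 30 tries.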


