bug-dejagnu
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in


From: Jacob Bachmeyer
Subject: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case
Date: Thu, 25 Jun 2020 17:39:58 -0500
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.22) Gecko/20090807 MultiZilla/1.8.3.4e SeaMonkey/1.1.17 Mnenhy/0.7.6.0

Pedro Alves wrote:
On 6/25/20 3:43 AM, Jacob Bachmeyer wrote:
Pedro Alves wrote:
What that misses is that in some environments, particularly with slow embedded
system boards, running the full testsuite can take _days_.   This is all
compounded by the fact that the testsuite must be run in a large set of
different modes and boards to cover everything for a single target.  Even on
a modern GNU/Linux system, running all combinations of modes for full coverage
can take hours.

Having the testrun stop midway because of some specific testcase
blowing up can waste a lot of time.

The fact that the testsuite can take a while to fully run results in automated
testing systems, like GDB's buildbot, struggling to keep up with testing
all the commits that land in git.  On some buildbot setups, it's just not
possible to test all commits.  So if some testcase blows up the whole
test run, the next time you run the testsuite, it'll likely be that a
large number of unrelated commits went into the sources.  Losing a large
chunk of the testsuite results because of some small typo in one
testcase is thus undesirable, because noticing the typo in the first
place can take a while, and also you may be stuck with a "last good run"
for many of the testcases that gets older and older until
the typo is finally fixed.
I presume buildbot can be easily configured to pass a '--keep_going' option, 
perhaps as 'RUNTESTFLAGS=--keep_going' if needed?

What's more likely is we'll tweak GDB's Makefile so that make check
enables that option automatically.  If GCC does the same, which I would
encourage them to, since they also run the testsuite against systems
that can take hours/days to complete, then I'll get to wonder who the
default is for.  :-)

The default is for developers working on testsuites -- and for testing release candidates and releases because if your testcases crash in a release, you have problems. :-)

Are you suggesting withdrawing the --keep-going/--no-keep-going options and always generating an UNRESOLVED result, resuming with the next test script, and possibly storing a list of aborted test scripts somewhere to repeat at the very end of a test run with a big warning about test cases that crashed?

Still, needing a new option to get old behavior is changing the environment 
interface, so I will make --keep_going the default, at least for the rest of 
1.6.  That default is likely to change for 1.7.  For automated systems like 
buildbot, passing '--keep_going' should be perfectly reasonable, which is why 
the option exists.  (While the documented options will be changed to use 
hyphens to align with the rest of the GNU system in 1.7, the current option 
names will continue to be accepted, since the aliases are obvious.)

Currently with DejaGnu only the case of calling an unknown function stops
the test run AFAIK.  Any other tcl error like calling a proc with the wrong
number of arguments, or treating a non-array variable as an array, etc. is
caught by the catch in the runtest proc in runtest.exp, and the testrun
continues.  That to me clearly indicates that the original intention was
to catch errors and continue.

That also gives the tool's $(tool)_finish proc a chance to tear down
correctly.

It is just that the "unknown" case wasn't thought of.  So I argue that not
making unknown proc calls abort the run is a bug fix, and making DejaGnu
abort the run for other errors by default is a behavior change that no real
testsuite built around DejaGnu is asking for.

As it was explained to me, it is the other way around -- the catch in runtest was overlooked when ::unknown was added to catch bugs by aborting if an undefined procedure is called.

Is there a ${tool}_finish or do you mean ${tool}_exit? Either way, you make a good point about running cleanup procedures when aborting a test run.

The entire reason for stopping the testsuite immediately when a Tcl error 
occurs is that it is undeniable that something is *very* *wrong* when the 
testsuite stops early.  Software testing is hard enough when you know your 
testsuite is good, but it is far worse when you have bugs in the testsuite 
masking bugs in the actual program because a test script is crashing and part 
of the program that is supposed to be tested is not being exercised.

I think it's safe to say that we all understand the general principle,
but IMO that applies more to continuing the same testcase (if somehow that
were possible, like "unknown" suppressing the error), rather than continuing
with a different testcase.

Continuing the testrun when one testcase errors out does not mask any
bug in the program that is supposed to be tested.  Why would it?  The
testcase aborts, it doesn't continue.  The remaining testcases start
afresh.

Consider a hypothetical case where GDB has exec.exp (which tests starting a debugged process) and attach.exp (which tests attaching to a process). Unknown to the GDB developers, the "attach" command does not work and causes GDB to dump core (oops!) but attach.exp fails with a Tcl error before reaching the point where "attach" is issued to the inferior GDB. If DejaGnu stops immediately, it is obvious that the testsuite is broken, the bug in the testsuite is fixed, and the bug in GDB is found. If DejaGnu sweeps the Tcl error under the proverbial rug (as previous versions did), the bug in the testsuite effectively hides the bug in GDB , because the developers think the "attach" command is being exercised, but those tests are being (almost) silently skipped.

Now a bug that obvious will be caught very quickly by users (likely the GDB developers themselves) but obscure regressions are for more insidious, especially in large testsuites such as those for GCC or GDB. If a regression test is being skipped with an easily-missed error, that regression can slip in unnoticed.

Would simply turning aborted test scripts into UNRESOLVED results be enough to get them fixed quickly?


-- Jacob





reply via email to

[Prev in Thread] Current Thread [Next in Thread]