
Re: [Qemu-devel] [RFC PATCH 2/2] tests/Makefile: comment out flakey test


From: Thomas Huth
Subject: Re: [Qemu-devel] [RFC PATCH 2/2] tests/Makefile: comment out flakey tests
Date: Tue, 22 May 2018 08:01:05 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0

On 19.05.2018 13:36, Peter Maydell wrote:
> On 19 May 2018 at 07:10, Thomas Huth <address@hidden> wrote:
>> On 18.05.2018 20:31, Peter Maydell wrote:
>>> Another flaky test for the collection:
>>>
>>> TEST: tests/boot-serial-test... (pid=25144)
>>>   /sparc64/boot-serial/sun4u:                                          **
>>> ERROR:/home/petmay01/linaro/qemu-for-merges/tests/boot-serial-test.c:140:check_guest_output:
>>> assertion failed: (output_ok)
>>> FAIL
>>>
>>> Probably another "overly optimistic timeout" setting. (Failed
>>> for me on x86-64 host just now.)
>>
>> That test normally finishes within 3 seconds on my machine. The test
>> timeout is 60 seconds. How much load did you have on that machine to go
>> from 3s to 60s ?
> 
> The machine is my desktop box; I didn't notice anything too
> terrible while I was using it interactively at the same time
> the test build was running. The test build will run at -j8;
> it might also have been during a different -j8 build/test
> on the same machine for a different source tree.

That does not sound like it could cause a test time increase from 3s
to more than 60s. Maybe from 3s to 10s or 20s, but to more than 60s?

> 60s is quite a long time, so maybe there's an intermittent
> deadlock in there instead...

I just had a look through my mails, and the last (and, as far as I
remember, the only) time we saw an unexplained error with the boot
serial tester was here:

https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg01057.html

That was also related to sparc, though it was 32-bit sparc, not 64-bit
sparc. Could it still be related?

Anyway, I have no clue how to debug this properly; so far I have not been
able to reproduce it on my laptop. I can think of the following options:

1) Increase the test timeout from 60s to maybe 90s or 120s.

2) Add an option to run tests without a timeout (i.e. infinite timeout).

3) What could really be helpful for debugging: Move the
"unlink(serialtmp);" in the test to the end of the function, so that the
output file does not get deleted when the test aborts unexpectedly.

4) If it's really just the sparc tests that are failing, we could run
them in the SPEED=slow mode only, so that they do not break the normal
integration tests. Not sure whether we are confident enough for that
yet, though.
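
To make options 1) to 3) a bit more concrete, here is a rough sketch. It is
not the actual boot-serial-test.c code: the helper names (parse_timeout,
file_contains, check_and_cleanup) are made up for illustration, and passing
the timeout in as a string (for example taken from an environment variable)
is just an assumption about how such an option could look.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper for options 1) and 2): turn a user-supplied string
 * into a timeout in seconds. A value of 0 means "no timeout". Anything
 * missing or unparsable falls back to the default.
 */
static int parse_timeout(const char *value, int fallback)
{
    char *end;
    long secs;

    if (!value || !*value) {
        return fallback;
    }
    secs = strtol(value, &end, 10);
    if (*end != '\0' || secs < 0 || secs > INT_MAX) {
        return fallback;
    }
    return (int)secs;
}

/* Hypothetical helper: true if `expected` occurs in the first 4 KiB. */
static bool file_contains(const char *path, const char *expected)
{
    char buf[4096];
    size_t n;
    FILE *f = fopen(path, "r");

    if (!f) {
        return false;
    }
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return strstr(buf, expected) != NULL;
}

/*
 * Sketch of option 3): the unlink() is the last step and only happens on
 * success, so the serial output survives a failed check for inspection.
 */
static bool check_and_cleanup(const char *serialtmp, const char *expected)
{
    bool ok;

    assert(serialtmp != NULL && expected != NULL);
    ok = file_contains(serialtmp, expected);
    if (ok) {
        unlink(serialtmp);
    } else {
        fprintf(stderr, "check failed, output kept in %s\n", serialtmp);
    }
    return ok;
}
```

A caller could feed the result of getenv() for some variable of its choice
into parse_timeout(), treat 0 as "wait forever", and only call
check_and_cleanup() once the guest output is complete or the timeout expired.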

What do you think?

 Thomas


