Hi,
On 09/26/2016 02:13 PM, Eduardo Habkost wrote:
On Sun, Sep 25, 2016 at 04:55:53PM -0400, Marc-André Lureau wrote:
Hi
----- Original Message -----
This time with Marc-André in cc:...
On 09/23/2016 07:40 PM, Maxime Coquelin wrote:
On 09/23/2016 05:41 PM, Michael S. Tsirkin wrote:
On Fri, Sep 23, 2016 at 12:36:12PM -0300, Eduardo Habkost wrote:
Hi,
I hit a weird vhost-user-test failure on travis-ci recently, on a
branch where I didn't touch any vhost-related code. From a quick
look at the code, it looks like the vhost-user code is unhappy to
see a disconnected socket.
I wasn't able to reproduce it. It seems to be a hard-to-reproduce
race between vhost-user code and socket reconnection.
The failure can be seen at:
https://travis-ci.org/ehabkost/qemu-hacks/jobs/162077239
Maxime looked at something similar. Any idea?
No, not really.
Marc-André contributed a lot to these tests, so I'm adding him in cc: in
case he has an idea.
I will have a look in the meantime.
I am unable to reproduce locally (over 500 iterations), and I
have no clue what's going on: the warnings there aren't the
problem (that's the main reason why we use the subprocess, to
silence those). Do you have a local reproducer or is it only on
travis? AFAIK, there are no other reports of this test failing.
Are you sure it's not related to changes on your branch?
I don't have a local reproducer, I could only see it once on
travis-ci. Maybe it is not possible to reproduce it if the
machine isn't loaded enough to make the right thread/process be
delayed.
I'm also trying to reproduce it.
Interestingly, launching the test with strace, I reproduce another
problem systematically:
$> strace -o /tmp/vut -ff ./tests/vhost-user-test
/x86_64/vhost-user/read-guest-mem: OK
/x86_64/vhost-user/migrate: Vhost user backend fails to broadcast fake RARP
OK
/x86_64/vhost-user/reconnect: OK
I'll try to load the CPU randomly when executing the test.
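
For reference, here is a rough sketch of what I mean by loading the CPU randomly. It is only an illustration, not something from the QEMU tree: the names run_under_load and BURNER_SRC, the number of burner processes, and the busy/idle intervals are all my own assumptions. It starts a few helper processes that alternate random busy and idle periods, runs the given command repeatedly while they compete for the CPU, and returns the iterations that exited non-zero.

```python
import subprocess
import sys

# Source of a hypothetical helper process: it alternates random busy-loop
# and sleep periods so the scheduler delays the test at varying points.
BURNER_SRC = """
import random, time
while True:
    busy_until = time.monotonic() + random.uniform(0.01, 0.2)
    while time.monotonic() < busy_until:
        pass
    time.sleep(random.uniform(0.0, 0.05))
"""


def run_under_load(cmd, iterations=100, workers=4):
    """Run cmd repeatedly under random CPU load.

    Returns the list of iteration indexes whose exit status was non-zero.
    The defaults are arbitrary and should be tuned to the machine.
    """
    # Start the background load before the first iteration.
    burners = [
        subprocess.Popen([sys.executable, "-c", BURNER_SRC])
        for _ in range(workers)
    ]
    failures = []
    try:
        for i in range(iterations):
            result = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            if result.returncode != 0:
                failures.append(i)
    finally:
        # Always stop the burners, even if running cmd raised.
        for burner in burners:
            burner.terminate()
        for burner in burners:
            burner.wait()
    return failures
```

Something like run_under_load(["./tests/vhost-user-test"], iterations=500) would then report which runs failed, if any. Whether this kind of load is enough to trigger the race seen on travis-ci is an open question.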