qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: iotest failure -- test possibly not using sufficiently unique temp f


From: Max Reitz
Subject: Re: iotest failure -- test possibly not using sufficiently unique temp filename?
Date: Fri, 18 Oct 2019 10:42:30 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.1.0

On 18.10.19 08:20, Thomas Huth wrote:
> On 17/10/2019 18.41, Peter Maydell wrote:
>> On Fri, 27 Sep 2019 at 17:44, Max Reitz <address@hidden> wrote:
>>>
>>> On 27.09.19 18:39, Peter Maydell wrote:
>>>> Hi; I just saw this iotest failure (on an s390x box, as it happens):
>>>>
>>>>   TEST    iotest-qcow2: 130 [fail]
>>>> QEMU          --
>>>> "/home/linux1/qemu/build/all/tests/qemu-iotests/../../s390x-softmmu/qemu-system-s390x"
>>>> -nodefaults -display none -machine accel=qtest
>>>> QEMU_IMG      -- 
>>>> "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-img"
>>>> QEMU_IO       --
>>>> "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-io"
>>>> --cache writeback -f qcow2
>>>> QEMU_NBD      -- 
>>>> "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-nbd"
>>>> IMGFMT        -- qcow2 (compat=1.1)
>>>> IMGPROTO      -- file
>>>> PLATFORM      -- Linux/s390x lxub05 4.15.0-58-generic
>>>> TEST_DIR      -- /home/linux1/qemu/build/all/tests/qemu-iotests/scratch
>>>> SOCKET_SCM_HELPER --
>>>> /home/linux1/qemu/build/all/tests/qemu-iotests/socket_scm_helper
>>>>
>>>> --- /home/linux1/qemu/tests/qemu-iotests/130.out        2019-05-10
>>>> 12:27:16.948075733 -0400
>>>> +++ /home/linux1/qemu/build/all/tests/qemu-iotests/130.out.bad
>>>> 2019-09-27 12:01:23.649722655 -0400
>>>> @@ -18,20 +18,22 @@
>>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>>>  (qemu) commit testdisk
>>>>  (qemu)
>>>> -image: TEST_DIR/t.IMGFMT
>>>> -file format: IMGFMT
>>>> -virtual size: 64 MiB (67108864 bytes)
>>>> -backing file: TEST_DIR/t.IMGFMT.orig
>>>> -backing file format: raw
>>>> +qemu-img: Could not open 'TEST_DIR/t.IMGFMT': Failed to get shared 
>>>> "write" lock
>>>> +Is another process using the image [TEST_DIR/t.IMGFMT]?
>>>>
>>>>  === Marking image dirty (lazy refcounts) ===
>>>>
>>>> +qemu-img: TEST_DIR/t.IMGFMT: Failed to get "write" lock
>>>> +Is another process using the image [TEST_DIR/t.IMGFMT]?
>>>>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
>>>> -wrote 4096/4096 bytes at offset 0
>>>> -4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>>>> +qemu-io: can't open device
>>>> /home/linux1/qemu/build/all/tests/qemu-iotests/scratch/t.qcow2: Failed
>>>> to get "write" lock
>>>> +Is another process using the image
>>>> [/home/linux1/qemu/build/all/tests/qemu-iotests/scratch/t.qcow2]?
>>>> +no file open, try 'help open'
>>>>  image: TEST_DIR/t.IMGFMT
>>>>  file format: IMGFMT
>>>>  virtual size: 64 MiB (67108864 bytes)
>>>> +backing file: TEST_DIR/t.IMGFMT.orig
>>>> +backing file format: raw
>>>>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
>>>> backing_file=TEST_DIR/t.IMGFMT.orig backing_fmt=raw
>>>>  wrote 4096/4096 bytes at offset 0
>>>>  4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>>>>
>>>>
>>>>
>>>> This looks suspiciously like the test isn't using a unique
>>>> filename for its disk image: "qemu-iotests/scratch/t.qcow2"
>>>> in the build directory, and so perhaps it has collided with
>>>> another iotest ?
>>>>
>>>> If we run 'make check' with a -j<something> option do the
>>>> iotests all get run serially anyway, or do they run in
>>>> parallel against each other ?
>>>
>>> As far as I know, all iotests are executed serially.  Anything else
>>> would not work with the same scratch directory.
>>>
>>> The only thing I suspect is that some tool has been accidentally left
>>> running by some previous test that still accesses its own image.  But I
>>> don’t know.
>>
>> Just saw this one again with the same iotest 130 on the same
>> s390 box; only difference is that the log this time around
>> has the first part where qemu-img fails, but not the second part
>> where qemu-io fails:
>>
>> --- /home/linux1/qemu/tests/qemu-iotests/130.out        2019-05-10
>> 12:27:16.948075733 -0400
>> +++ /home/linux1/qemu/build/all/tests/qemu-iotests/130.out.bad
>> 2019-10-17 11:56:43.450750873 -0400
>> @@ -18,11 +18,8 @@
>>  QEMU X.Y.Z monitor - type 'help' for more information
>>  (qemu) commit testdisk
>>  (qemu)
>> -image: TEST_DIR/t.IMGFMT
>> -file format: IMGFMT
>> -virtual size: 64 MiB (67108864 bytes)
>> -backing file: TEST_DIR/t.IMGFMT.orig
>> -backing file format: raw
>> +qemu-img: Could not open 'TEST_DIR/t.IMGFMT': Failed to get shared "write" 
>> lock
>> +Is another process using the image [TEST_DIR/t.IMGFMT]?
>>
>>  === Marking image dirty (lazy refcounts) ===
>>
>> On the host machine there don't seem to be any stray
>> processes which might have held the file open, and
>> indeed the file doesn't exist at all, so it got removed
>> by some cleanup or other.
> 
> Ok, so unless someone has a clue what might be going on here (is there a
> race in the test?), I'd suggest that we simply remove 130 from the auto
> group again. Shall I send a patch?

I don’t have much of an idea.  It looks like maybe the qemu process
(which dos the commit) is lingering, but that shouldn’t be because
_cleanup_qemu always waits for it.  (Also, I can’t reproduce the problem
on my system.)

The only hunch that I have is that 130 seems to be the only test that
uses _cleanup_qemu to kill a qemu process (i.e. without wait=1) while it
has taken locks on an image and then still uses the image afterwards.
Maybe making it quit qemu through HMP would fix the problem.  But
knowing is difficult because I can’t reproduce it.

Max

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]