qemu-s390x

Re: qemu iotest 161 and make check


From: Christian Borntraeger
Subject: Re: qemu iotest 161 and make check
Date: Thu, 27 Oct 2022 07:54:44 +0200



On 31.03.22 at 10:25, Christian Borntraeger wrote:


On 31.03.22 at 09:44, Christian Borntraeger wrote:


On 21.02.22 at 11:27, Christian Borntraeger wrote:

On 10.02.22 at 18:44, Vladimir Sementsov-Ogievskiy wrote:
10.02.2022 20:13, Thomas Huth wrote:
On 10/02/2022 15.51, Christian Borntraeger wrote:


On 10.02.22 at 15:47, Vladimir Sementsov-Ogievskiy wrote:
10.02.2022 10:57, Christian Borntraeger wrote:
Hello,

I do see spurious failures of 161 in our CI, but only when I use
make check with parallelism (-j).
I have not yet figured out which other test case could interfere.

@@ -34,6 +34,8 @@
  *** Commit and then change an option on the backing file

  Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=1048576
+qemu-img: TEST_DIR/t.IMGFMT.base: Failed to get "write" lock

FWIW, qemu_lock_fd_test returns -11 (EAGAIN)
and raw_check_lock_bytes reports this error.


And it's coming from here (ret is 0):

int qemu_lock_fd_test(int fd, int64_t start, int64_t len, bool exclusive)
{
     int ret;
     struct flock fl = {
         .l_whence = SEEK_SET,
         .l_start  = start,
         .l_len    = len,
         .l_type   = exclusive ? F_WRLCK : F_RDLCK,
     };
     qemu_probe_lock_ops();
     ret = fcntl(fd, fcntl_op_getlk, &fl);
     if (ret == -1) {
         return -errno;
     } else {
         return fl.l_type == F_UNLCK ? 0 : -EAGAIN; /* <----- here */
     }
}



Is this just some overload situation that we do not recover from because we
do not handle EAGAIN specially?

I restarted my investigation. It looks like the file lock held by qemu is not
fully released when the process is gone.
Something like
diff --git a/tests/qemu-iotests/common.qemu b/tests/qemu-iotests/common.qemu
index 0f1fecc68e..b28a6c187c 100644
--- a/tests/qemu-iotests/common.qemu
+++ b/tests/qemu-iotests/common.qemu
@@ -403,4 +403,5 @@ _cleanup_qemu()
         unset QEMU_IN[$i]
         unset QEMU_OUT[$i]
     done
+    sleep 0.5
 }


makes the problem go away.

It looks like we use the OFD variant of the file lock, so any process created
by clone, fork, etc. that inherits the file descriptor will keep the lock.
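
To illustrate why an inherited descriptor can keep such a lock alive, here is a minimal sketch. It is only an analogy: it uses the flock(1) utility, which takes flock(2) locks rather than the fcntl OFD locks qemu uses. Both kinds of lock belong to the open file description, so they behave the same way for this purpose. The temporary file, fd number 9, and the `sleep` child are all hypothetical stand-ins, not anything from the iotests.

```sh
#!/bin/sh
# Analogy only: flock(2) locks, like fcntl OFD locks, are attached to the
# open file description and are shared by every inherited copy of the fd.
lockfile=$(mktemp)

exec 9>"$lockfile"      # open the file on fd 9 (one open file description)
flock -n 9              # take an exclusive lock through fd 9
sleep 1 &               # the child inherits fd 9, i.e. the same description
holder=$!
exec 9>&-               # the original owner closes its copy of the fd

# A fresh open of the same file still conflicts while the child is alive.
if flock -n "$lockfile" true; then
    while_child_alive="lock free"
else
    while_child_alive="lock held"
fi

wait "$holder"          # the last copy of the fd is closed when the child exits
if flock -n "$lockfile" true; then
    after_child_exit="lock free"
else
    after_child_exit="lock held"
fi

rm -f "$lockfile"
echo "while child alive: $while_child_alive"
echo "after child exit:  $after_child_exit"
```

If the same holds in the test, a lingering process that inherited qemu's descriptor would be enough to make the next qemu-img invocation fail its lock probe.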

So I tested the following:

diff --git a/tests/qemu-iotests/common.qemu b/tests/qemu-iotests/common.qemu
index 0f1fecc68e..01bdb05575 100644
--- a/tests/qemu-iotests/common.qemu
+++ b/tests/qemu-iotests/common.qemu
@@ -388,7 +388,7 @@ _cleanup_qemu()
                 kill -KILL ${QEMU_PID} 2>/dev/null
             fi
             if [ -n "${QEMU_PID}" ]; then
-                wait ${QEMU_PID} 2>/dev/null # silent kill
+                wait 2>/dev/null # silent kill
             fi
         fi


This also helps. I am still trying to find out which clone/fork happens here.
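
For reference, the difference between the two forms of `wait` in the diff above can be sketched as follows. This only shows the shell semantics. The two `sleep` children are hypothetical stand-ins, and nothing here says which process actually lingers in the iotests.

```sh
#!/bin/sh
# "wait PID" returns once that one child has exited; "wait" with no
# operands blocks until all background children of the shell have exited.
sleep 0.2 &             # short-lived child, stands in for the main process
main_pid=$!
sleep 1 &               # longer-lived child, stands in for some helper
helper_pid=$!

wait "$main_pid"        # only waits for the short-lived child
if kill -0 "$helper_pid" 2>/dev/null; then
    after_wait_pid="helper still running"
else
    after_wait_pid="helper gone"
fi

wait                    # waits for every remaining background child
if kill -0 "$helper_pid" 2>/dev/null; then
    after_wait_all="helper still running"
else
    after_wait_all="helper gone"
fi

echo "after wait PID: $after_wait_pid"
echo "after wait:     $after_wait_all"
```

So with `wait ${QEMU_PID}` the cleanup can return while another background child of the shell is still alive, whereas a bare `wait` cannot.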


