Re: [Qemu-block] [PATCH] backup: block-job error BUG


From: John Snow
Subject: Re: [Qemu-block] [PATCH] backup: block-job error BUG
Date: Mon, 1 Aug 2016 18:36:14 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0



On 07/18/2016 05:22 PM, Vladimir Sementsov-Ogievskiy wrote:
Hi all!

This is a variant of existing test case which produces test failure.

It looks like the reason is:

one block job (the successful one) is in backup_complete, inside a synchronous bdrv_flush;
the other job (the one with the injected I/O error) tries to synchronously cancel that successful job.
It looks like some kind of deadlock.

-..........
+........EEE
+======================================================================
+ERROR: test_transaction_failure (__main__.TestIncrementalBackup)
+Test: Verify backups made from a transaction that partially fails.
+----------------------------------------------------------------------
+Traceback (most recent call last):
+  File "124", line 478, in test_transaction_failure
+    self.wait_qmp_backup_cancelled(drive0['id'])
+  File "124", line 173, in wait_qmp_backup_cancelled
+    match={'data': {'device': device}})
+  File "/work/src/qemu/tests/qemu-iotests/iotests.py", line 308, in event_wait
+    event = self._qmp.pull_event(wait=timeout)
+  File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 194, 
in pull_event
+    self.__get_events(wait)
+  File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 109, 
in __get_events
+    raise QMPTimeoutError("Timeout waiting for event")
+QMPTimeoutError: Timeout waiting for event
+
+======================================================================
+ERROR: test_transaction_failure (__main__.TestIncrementalBackup)
+Test: Verify backups made from a transaction that partially fails.
+----------------------------------------------------------------------
+Traceback (most recent call last):
+  File "124", line 272, in tearDown
+    self.vm.shutdown()
+  File "/work/src/qemu/tests/qemu-iotests/iotests.py", line 260, in shutdown
+    self._qmp.cmd('quit')
+  File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 172, 
in cmd
+    return self.cmd_obj(qmp_cmd)
+  File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 157, 
in cmd_obj
+    return self.__json_read()
+  File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 66, 
in __json_read
+    data = self.__sockfile.readline()
+  File "/usr/lib64/python2.7/socket.py", line 447, in readline
+    data = self._sock.recv(self._rbufsize)
+timeout: timed out
+
+======================================================================
+ERROR: test_incremental_failure (__main__.TestIncrementalBackupBlkdebug)
+Test: Verify backups made after a failure are correct.
+----------------------------------------------------------------------
+Traceback (most recent call last):
+  File "124", line 549, in setUp
+    self.vm.launch()
+  File "/work/src/qemu/tests/qemu-iotests/iotests.py", line 246, in launch
+    self._qmp = qmp.QEMUMonitorProtocol(self._monitor_path, server=True)
+  File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 44, 
in __init__
+    self.__sock.bind(self.__address)
+  File "/usr/lib64/python2.7/socket.py", line 224, in meth
+    return getattr(self._sock,name)(*args)
+error: [Errno 98] Address already in use
+

I guess this is a residual failure caused by our inability to clean up properly after the real failure.

 ----------------------------------------------------------------------
 Ran 10 tests

-OK
+FAILED (errors=3)
Failures: 124

Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
---
 tests/qemu-iotests/124 | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index de7cdbe..74de117 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124
@@ -448,9 +448,9 @@ class TestIncrementalBackup(TestIncrementalBackupBase):
         self.assertFalse(self.vm.get_qmp_events(wait=False))

         # Emulate some writes
-        self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
-                                          ('0xfe', '16M', '256k'),
-                                          ('0x64', '32736k', '64k')))
+        #self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
+        #                                  ('0xfe', '16M', '256k'),
+        #                                  ('0x64', '32736k', '64k')))
         self.hmp_io_writes(drive1['id'], (('0xba', 0, 512),
                                           ('0xef', '16M', '256k'),
                                           ('0x46', '32736k', '64k')))


Python diffs are awful.

For the sake of the list: You modified the test_transaction_failure test case, which tests an injected failure on one of two drives, both with writes that need to be committed.

You induce the failure by removing the pending writes from drive0 such that drive0 has nothing it needs to commit at all.

We then time out waiting for notification that drive0's job was canceled... Hm, perhaps this is a race condition where drive0's job finishes too quickly to be canceled?
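
If that is the race, one way to make the wait tolerant of it would be to accept either outcome for drive0's job. A rough sketch, reusing the get_qmp_events() helper the test already calls above (the method name wait_qmp_backup_finished is hypothetical, not something in the patch or in iotests.py):

    def wait_qmp_backup_finished(self, device):
        # Sketch: wait until the backup job for @device either completes or
        # is cancelled, and report which of the two events arrived first.
        while True:
            for event in self.vm.get_qmp_events(wait=True):
                if (event['event'] in ('BLOCK_JOB_COMPLETED',
                                       'BLOCK_JOB_CANCELLED')
                        and event['data'].get('device') == device):
                    return event['event']

That would at least distinguish "drive0's job completed before we could cancel it" from a genuine hang.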


