[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 3/3] iotests: Test commit with iothreads and ongoing I/O
From: |
Eric Blake |
Subject: |
Re: [PATCH 3/3] iotests: Test commit with iothreads and ongoing I/O |
Date: |
Thu, 18 May 2023 16:00:31 -0500 |
User-agent: |
NeoMutt/20230517 |
On Wed, May 17, 2023 at 05:28:34PM +0200, Kevin Wolf wrote:
> This tests exercises graph locking, draining, and graph modifications
> with AioContext switches a lot. Amongst others, it serves as a
> regression test for bdrv_graph_wrlock() deadlocking because it is called
> with a locked AioContext and for AioContext handling in the NBD server.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
I've now confirmed the following setups with just './check
graph-changes-while-io -qcow2':
patch 3 alone => test fails with wrlock assertion (good, we're
catching the 8.0 regression where the new assertion failure is
tripping)
patch 1 and 3 => test fails differently than patch 3 alone (good,
we're exposing the fact that NBD had a pre-existing bug, regardless of
whether the added rwlock made it easier to spot)
patch 2 and 3 => test passes (good, patch 2 appears to have fixed this
particular bug, and when we are ready to revert patch 1 because we get
rid of AioContext locking we'll still be okay)
patch 1, 2, and 3 => test passes (good, we fixed the NBD bug, and have
regression testing in place for a scenario that previously wasn't
getting good testing)
As such, I'm happy to supply:
Tested-by: Eric Blake <eblake@redhat.com>
Now on to the patch itself...
> ---
> tests/qemu-iotests/iotests.py | 4 ++
> .../qemu-iotests/tests/graph-changes-while-io | 56 +++++++++++++++++--
> .../tests/graph-changes-while-io.out | 4 +-
> 3 files changed, 58 insertions(+), 6 deletions(-)
>
> @@ -84,6 +84,54 @@ class TestGraphChangesWhileIO(QMPTestCase):
>
> bench_thr.join()
>
> + def test_commit_while_io(self) -> None:
> + # Run qemu-img bench in the background
> + bench_thr = Thread(target=do_qemu_img_bench, args=(200000, ))
TIL - you can create a 1-item tuple in Python. It caught me
off-guard, but makes sense now that I've re-read it.
> +
> + # While qemu-img bench is running, repeatedly commit overlay to node0
> + while bench_thr.is_alive():
> + result = self.qsd.qmp('block-commit', {
> + 'job-id': 'job0',
> + 'device': 'overlay',
> + })
> + self.assert_qmp(result, 'return', {})
> +
> + result = self.qsd.qmp('block-job-cancel', {
> + 'device': 'job0',
> + })
> + self.assert_qmp(result, 'return', {})
> +
> + cancelled = False
> + while not cancelled:
> + for event in self.qsd.get_qmp().get_events(wait=10.0):
The updated test took about 34 seconds on my machine; long enough that
it is rightfully not part of './check -g quick', but still reasonable
that I had no problems reproducing the issue while the test was
running.
> + if event['event'] != 'JOB_STATUS_CHANGE':
> + continue
> + if event['data']['status'] == 'null':
> + cancelled = True
> +
> + bench_thr.join()
> +
It feels a bit odd that the test is skipped during './check -nbd', yet
it IS utilizing nbd, and the fix in patch 2 is indeed in NBD code.
But that's not a flaw in the test itself, just a limitation of what
images we need in order to set up the NBD service in a way to trigger
the problem.
Reviewed-by: Eric Blake <eblake@redhat.com>
I'm in a spot where I can quickly queue this through my NBD tree so we
can get it backported to 8.0.1; pull request coming up, provided the
full series passes a few more tests on my end.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org