qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PULL 10/66] tests/qtest: enable tests for virtio-scmi


From: Milan Zamazal
Subject: Re: [PULL 10/66] tests/qtest: enable tests for virtio-scmi
Date: Thu, 20 Jul 2023 10:34:47 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

Fabiano Rosas <farosas@suse.de> writes:

> Thomas Huth <thuth@redhat.com> writes:
>
>> On 18/07/2023 14.55, Milan Zamazal wrote:
>>> Thomas Huth <thuth@redhat.com> writes:
>>> 
>>>> On 11/07/2023 01.02, Michael S. Tsirkin wrote:
>>>>> From: Milan Zamazal <mzamazal@redhat.com>
>>>>> We don't have a virtio-scmi implementation in QEMU and only support
>>>>
>>>>> a
>>>>> vhost-user backend.  This is very similar to virtio-gpio and we add the 
>>>>> same
>>>>> set of tests, just passing some vhost-user messages over the control 
>>>>> socket.
>>>>> Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
>>>>> Acked-by: Thomas Huth <thuth@redhat.com>
>>>>> Message-Id: <20230628100524.342666-4-mzamazal@redhat.com>
>>>>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>>>>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>>>>> ---
>>>>>    tests/qtest/libqos/virtio-scmi.h |  34 ++++++
>>>>>    tests/qtest/libqos/virtio-scmi.c | 174 +++++++++++++++++++++++++++++++
>>>>>    tests/qtest/vhost-user-test.c    |  44 ++++++++
>>>>>    MAINTAINERS                      |   1 +
>>>>>    tests/qtest/libqos/meson.build   |   1 +
>>>>>    5 files changed, 254 insertions(+)
>>>>>    create mode 100644 tests/qtest/libqos/virtio-scmi.h
>>>>>    create mode 100644 tests/qtest/libqos/virtio-scmi.c
>>>>
>>>>   Hi!
>>>>
>>>> I'm seeing some random failures with this new scmi test, so far only
>>>> on non-x86 systems, e.g.:
>>>>
>>>>   https://app.travis-ci.com/github/huth/qemu/jobs/606246131#L4774
>>>>
>>>> It also reproduces on a s390x host here, but only if I run "make check
>>>> -j$(nproc)" - if I run the tests single-threaded, the qos-test passes
>>>> there. Seems like there is a race somewhere in this test?
>>> 
>>> Hmm, it's basically the same as virtio-gpio.c test, so it should be OK.
>>> Is it possible that the two tests (virtio-gpio.c & virtio-scmi.c)
>>> interfere with each other in some way?  Is there possibly a way to
>>> serialize them to check?
>>
>> I think within one qos-test, the sub-tests are already run
>> serialized. But there might be multiple qos-tests running in
>> parallel, e.g. one for the aarch64 target and one for the ppc64
>> target. And indeed, I can reproduce the problem on my x86 laptop by
>> running this in one terminal window:
>>
>> for ((x=0;x<1000;x++)); do \
>>   QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon \
>>   G_TEST_DBUS_DAEMON=.tests/dbus-vmstate-daemon.sh \
>>   QTEST_QEMU_BINARY=./qemu-system-ppc64 \
>>   MALLOC_PERTURB_=188 QTEST_QEMU_IMG=./qemu-img \
>>   tests/qtest/qos-test -p \
>>   
>> /ppc64/pseries/spapr-pci-host-bridge/pci-bus-spapr/pci-bus/vhost-user-scmi-pci/vhost-user-scmi/vhost-user-scmi-tests/scmi/read-guest-mem/memfile
>> \
>>   || break ; \
>> done
>>
>> And this in another terminal window at the same time:
>>
>> for ((x=0;x<1000;x++)); do \
>>   QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon \
>>   G_TEST_DBUS_DAEMON=.tests/dbus-vmstate-daemon.sh \
>>   QTEST_QEMU_BINARY=./qemu-system-aarch64 \
>>   MALLOC_PERTURB_=188 QTEST_QEMU_IMG=./qemu-img \
>>   tests/qtest/qos-test -p \
>>   
>> /aarch64/virt/generic-pcihost/pci-bus-generic/pci-bus/vhost-user-scmi-pci/vhost-user-scmi/vhost-user-scmi-tests/scmi/read-guest-mem/memfile
>> \
>>   || break ; \
>> done
>>
>> After a while, the aarch64 test broke with:
>>
>> /aarch64/virt/generic-pcihost/pci-bus-generic/pci-bus/vhost-user-scmi-pci/vhost-user-scmi/vhost-user-scmi-tests/scmi/read-guest-mem/memfile:
>> qemu-system-aarch64: Failed to set msg fds.
>> qemu-system-aarch64: Failed to set msg fds.
>> qemu-system-aarch64: vhost VQ 0 ring restore failed: -22: Invalid argument 
>> (22)
>> qemu-system-aarch64: Failed to set msg fds.
>> qemu-system-aarch64: vhost VQ 1 ring restore failed: -22: Invalid argument 
>> (22)
>> qemu-system-aarch64: Failed to set msg fds.
>> qemu-system-aarch64: vhost_set_vring_call failed 22
>> qemu-system-aarch64: Failed to set msg fds.
>> qemu-system-aarch64: vhost_set_vring_call failed 22
>> qemu-system-aarch64: Failed to write msg. Wrote -1 instead of 20.
>> qemu-system-aarch64: Failed to set msg fds.
>> qemu-system-aarch64: vhost VQ 0 ring restore failed: -22: Invalid argument 
>> (22)
>> qemu-system-aarch64: Failed to set msg fds.
>> qemu-system-aarch64: vhost VQ 1 ring restore failed: -22: Invalid argument 
>> (22)
>> qemu-system-aarch64: ../../devel/qemu/hw/pci/msix.c:659:
>> msix_unset_vector_notifiers: Assertion
>> `dev->msix_vector_use_notifier && dev->msix_vector_release_notifier'
>> failed.
>> ../../devel/qemu/tests/qtest/libqtest.c:200: kill_qemu() detected
>> QEMU death from signal 6 (Aborted) (core dumped)
>
> If it helps,

It helps a lot, thank you!

> it looks like msix_unset_vector_notifiers is being called twice, once
> from vu_scmi_set_status() and another from vu_scmi_disconnect():

Interesting.  Usually, vu_scmi_stop is called only once, which explains
why the test regularly passes.  Both the vu_scmi_stop callers have a
check protecting from duplicate vu_scmi_stop calls but it's perhaps not
fully reliable.  I can see vhost-user-gpio has an extra protection.
I'll post a patch adding a similar thing, hopefully it will fix the
problem.

> msix_unset_vector_notifiers
> virtio_pci_set_guest_notifiers
> vu_scmi_stop
> vu_scmi_disconnect   <-
> vu_scmi_event
> chr_be_event
> qemu_chr_be_event
> tcp_chr_disconnect_locked
> tcp_chr_write
> qemu_chr_write_buffer
>
> msix_unset_vector_notifiers
> virtio_pci_set_guest_notifiers
> vu_scmi_stop
> vu_scmi_set_status   <-
> virtio_set_status
> virtio_vmstate_change
> vm_state_notify
> do_vm_stop
> vm_shutdown
> qemu_cleanup




reply via email to

[Prev in Thread] Current Thread [Next in Thread]