qemu-devel

Re: [PULL 29/32] virtio-blk: implement BlockDevOps->drained_begin()


From: Fiona Ebner
Subject: Re: [PULL 29/32] virtio-blk: implement BlockDevOps->drained_begin()
Date: Fri, 3 Nov 2023 14:12:04 +0100
User-agent: Mozilla Thunderbird

Hi,

On 30.05.23 at 18:32, Kevin Wolf wrote:
> From: Stefan Hajnoczi <stefanha@redhat.com>
> 
> Detach ioeventfds during drained sections to stop I/O submission from
> the guest. virtio-blk is no longer reliant on aio_disable_external()
> after this patch. This will allow us to remove the
> aio_disable_external() API once all other code that relies on it is
> converted.
> 
> Take extra care to avoid attaching/detaching ioeventfds if the data
> plane is started/stopped during a drained section. This should be rare,
> but maybe the mirror block job can trigger it.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> Message-Id: <20230516190238.8401-18-stefanha@redhat.com>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

A while ago I ran into a strange issue where guest IO would get
completely stuck during certain block jobs, and I finally managed to
find a small reproducer [0]. I'm using a VM with virtio-blk-pci (or
virtio-scsi-pci) with an iothread and running

    fio --name=file --size=100M --direct=1 --rw=randwrite --bs=4k \
        --ioengine=psync --numjobs=5 --runtime=1200 --time_based

in the guest. Then I issue the drive-mirror QMP command, with the
reproducer patch [0] applied, in a loop. Usually the guest IO gets stuck
after about 1-3 minutes. Sometimes fio manages to continue at a lower
speed for a while (but trying to Ctrl+C it or doing other IO in the guest
is already broken), which I guess could be a hint that it's an issue with
notifiers?
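
A minimal sketch of what such a loop could look like, assuming the VM
exposes a QMP UNIX socket. The socket path, device name, target path and
helper names below are hypothetical and not taken from the original
setup. With the reproducer patch [0] applied, qmp_drive_mirror() returns
right after the drain, so the target is never opened and only the
required arguments need to be present.

```python
import json
import socket
import time


def qmp_command(name, arguments=None):
    """Encode one QMP command as a JSON line."""
    cmd = {"execute": name}
    if arguments is not None:
        cmd["arguments"] = arguments
    return (json.dumps(cmd) + "\r\n").encode()


def qmp_read_reply(reader):
    """Read lines until a command reply arrives, skipping async events."""
    while True:
        line = reader.readline()
        if not line:
            raise ConnectionError("QMP socket closed")
        msg = json.loads(line)
        if "return" in msg or "error" in msg:
            return msg


def run_loop(sock, device, target, iterations, delay=0.0):
    """Negotiate capabilities, then issue drive-mirror repeatedly."""
    reader = sock.makefile("rb")
    json.loads(reader.readline())  # consume the QMP greeting
    sock.sendall(qmp_command("qmp_capabilities"))
    qmp_read_reply(reader)

    replies = []
    for _ in range(iterations):
        sock.sendall(qmp_command("drive-mirror", {
            "device": device,   # hypothetical device name
            "target": target,   # hypothetical path, unused with patch [0]
            "sync": "full",
        }))
        replies.append(qmp_read_reply(reader))
        time.sleep(delay)
    return replies


# Example use (paths and names are assumptions):
#   sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
#   sock.connect("/tmp/qmp.sock")
#   run_loop(sock, "drive0", "/tmp/mirror-target.raw", 100000)
```

Each iteration triggers one bdrv_drain_all_begin()/bdrv_drain_all_end()
pair in the patched QEMU while fio keeps submitting IO in the guest.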

Bisecting (to declare a commit good, I waited 10 minutes) led me to this
patch, i.e. commit 1665d9326f ("virtio-blk: implement
BlockDevOps->drained_begin()") and for SCSI, I verified that the issue
similarly starts happening after 766aa2de0f ("virtio-scsi: implement
BlockDevOps->drained_begin()").

Both issues are still present on current master (i.e. 1c98a821a2
("tests/qtest: Introduce tests for AMD/Xilinx Versal TRNG device")).

Happy to provide more information and hints about how to debug the issue
further.

Best Regards,
Fiona

[0]:

> diff --git a/blockdev.c b/blockdev.c
> index db2725fe74..bf2e0fc22c 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -2986,6 +2986,11 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
>      bool zero_target;
>      int ret;
>  
> +    bdrv_drain_all_begin();
> +    bdrv_drain_all_end();
> +    return;
> +
> +
>      bs = qmp_get_root_bs(arg->device, errp);
>      if (!bs) {
>          return;



