Re: [PATCH v2 3/3] virtio-scsi: reset SCSI devices from main loop thread

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 3/3] virtio-scsi: reset SCSI devices from main loop thread

From:	Kevin Wolf
Subject:	Re: [PATCH v2 3/3] virtio-scsi: reset SCSI devices from main loop thread
Date:	Fri, 17 Feb 2023 11:22:02 +0100

Am 10.02.2023 um 15:32 hat Stefan Hajnoczi geschrieben:
> When an IOThread is configured, the ctrl virtqueue is processed in the
> IOThread. TMFs that reset SCSI devices are currently called directly
> from the IOThread and trigger an assertion failure in blk_drain():
> 
>   ../block/block-backend.c:1780: void blk_drain(BlockBackend *): Assertion 
> `qemu_in_main_thread()' failed.
> 
> The blk_drain() function is not designed to be called from an IOThread
> because it needs the Big QEMU Lock (BQL).
> 
> This patch defers TMFs that reset SCSI devices to a Bottom Half (BH)
> that runs in the main loop thread under the BQL. This way it's safe to
> call blk_drain() and the assertion failure is avoided.

It's not entirely obvious what is the call path that leads to
blk_drain(). Do we somehow call into virtio_scsi_dataplane_stop()?

> Introduce s->tmf_bh_list for tracking TMF requests that have been
> deferred to the BH. When the BH runs it will grab the entire list and
> process all requests. Care must be taken to clear the list when the
> virtio-scsi device is reset or unrealized. Otherwise deferred TMF
> requests could execute later and lead to use-after-free or other
> undefined behavior.

Why don't we already need the same for other asynchronously processed
requests? Certainly having a read request write to guest memory after
the device has been reset or unplugged isn't what we want either.

I see that we assert(!s->dataplane_started) in virtio_scsi_reset(),
which may be part of the reason. If we're not processing any requests,
then we're safe. virtio_scsi_dataplane_stop() calls blk_drain_all()
(which is really a too big hammer) in order to make sure that in-flight
requests are completed before dataplane_started becomes false.

I was wondering if we couldn't just blk_inc_in_flight() while a TMF
request is in flight and then use the same draining logic to be covered.
You could use oneshot BHs then and do away with the list because you
don't need to cancel anything any more, you just wait until the BHs have
completed.

The practical problem may be that we don't have a blk here (which is
probably also why blk_drain_all() is used). We could have our own
AIO_WAIT_WHILE() instead. I feel waiting instead of cancelling BHs would
simplify the code.

In fact, I think technically, you may not need any of that because
blk_drain_all() also executes all BHs in the main loop before it
returns, but that might be a bit subtle...

> The s->resetting counter that's used by TMFs that reset SCSI devices is
> accessed from multiple threads. This patch makes that explicit by using
> atomic accessor functions. With this patch applied the counter is only
> modified by the main loop thread under the BQL but can be read by any
> thread.
> 
> Reported-by: Qing Wang <qinwang@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Kevin

[Prev in Thread]

Current Thread

[Next in Thread]

[PATCH v2 0/3] virtio-scsi: fix SCSIDevice hot unplug with IOThread, Stefan Hajnoczi, 2023/02/10
- [PATCH v2 1/3] scsi: protect req->aiocb with AioContext lock, Stefan Hajnoczi, 2023/02/10
  - Re: [PATCH v2 1/3] scsi: protect req->aiocb with AioContext lock, Eric Blake, 2023/02/15
  - Re: [PATCH v2 1/3] scsi: protect req->aiocb with AioContext lock, Kevin Wolf, 2023/02/16
- [PATCH v2 2/3] dma-helpers: prevent dma_blk_cb() vs dma_aio_cancel() race, Stefan Hajnoczi, 2023/02/10
  - Re: [PATCH v2 2/3] dma-helpers: prevent dma_blk_cb() vs dma_aio_cancel() race, Eric Blake, 2023/02/15
  - Re: [PATCH v2 2/3] dma-helpers: prevent dma_blk_cb() vs dma_aio_cancel() race, Kevin Wolf, 2023/02/16
    - Re: [PATCH v2 2/3] dma-helpers: prevent dma_blk_cb() vs dma_aio_cancel() race, Stefan Hajnoczi, 2023/02/16
- [PATCH v2 3/3] virtio-scsi: reset SCSI devices from main loop thread, Stefan Hajnoczi, 2023/02/10
  - Re: [PATCH v2 3/3] virtio-scsi: reset SCSI devices from main loop thread, Eric Blake, 2023/02/15
  - Re: [PATCH v2 3/3] virtio-scsi: reset SCSI devices from main loop thread, Kevin Wolf <=
    - Message not available
    - Re: [PATCH v2 3/3] virtio-scsi: reset SCSI devices from main loop thread, Stefan Hajnoczi, 2023/02/23

Prev by Date: Re: [PATCH 00/23] block: Lock the graph, part 2 (BlockDriver callbacks)
Next by Date: Re: [PATCH 2/5] spapr/ddw: Remove confuse return value in spapr_phb_get_free_liobn()
Previous by thread: Re: [PATCH v2 3/3] virtio-scsi: reset SCSI devices from main loop thread
Next by thread: Re: [PATCH v2 3/3] virtio-scsi: reset SCSI devices from main loop thread
Index(es):
- Date
- Thread