qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 0/7] Fix scsi devices plug/unplug races w.r.t virtio-scsi


From: Maxim Levitsky
Subject: Re: [PATCH v3 0/7] Fix scsi devices plug/unplug races w.r.t virtio-scsi iothread
Date: Wed, 29 Jul 2020 22:56:19 +0300
User-agent: Evolution 3.36.3 (3.36.3-1.fc32)

On Wed, 2020-07-15 at 18:01 +0300, Maxim Levitsky wrote:
> Hi!
> 
> This is a patch series that is a result of my discussion with Paulo on
> how to correctly fix the root cause of the BZ #1812399.
> 
> The root cause of this bug is the fact that IO thread is running mostly
> unlocked versus main thread on which device hotplug is done.
> 
> qdev_device_add first creates the device object, then places it on the bus,
> and only then realizes it.
> 
> However some drivers and currently only virtio-scsi enumerate its child bus
> devices on each request that is received from the guest and that can happen 
> on the IO
> thread.
> 
> Thus we have a window when new device is on the bus but not realized and can 
> be accessed
> by the virtio-scsi driver in that state.
> 
> Fix that by doing two things:
> 
> 1. Add partial RCU protection to the list of a bus's child devices.
> This allows the scsi IO thread to safely enumerate the child devices
> while it races with the hotplug placing the device on the bus.
> 
> 2. Make the virtio-scsi driver check .realized property of the scsi device
> and avoid touching the device if it isn't
> 
> Note that in the particular bug report the issue wasn't a race but rather due
> to combination of things, the .realize code in the middle managed to trigger 
> IO on the virtqueue
> which caused the virtio-scsi driver to access the half realized device. 
> However
> since this can happen as well with real IO thread, this patch series was done,
> which fixes this as well.
> 
> Changes from V1:
>   * Patch 2 is new, as suggested by Stefan, added drain_call_rcu() to fix the 
> failing unit test,
>     make check pass now
> 
>   * Patches 6,7 are new as well: I added scsi_device_get as suggested by 
> Stefan as well, although
>     this is more a refactoring that anything else as it doesn't solve
>     an existing race.
> 
>   * Addressed most of the review feedback from V1
>     - still need to decide if we need QTAILQ_FOREACH_WITH_RCU_READ_LOCK
> 
> Changes from V2:
> 
>   * No longer RFC
>   * Addressed most of the feedback from Stefan
>   * Fixed reference count leak in patch 7 when device is about to be 
> unrealized
>   * Better testing
> 
> This series was tested by adding a virtio-scsi drive with iothread,
> then running fio stress job in the guest in a loop, and then adding/removing
> the scsi drive on the host in the loop.
> This test was failing usually on 1st iteration withouth this patch series,
> and now it seems to work smoothly.
> 
> Best regards,
>       Maxim Levitsky
> 
> Maxim Levitsky (7):
>   scsi/scsi_bus: switch search direction in scsi_device_find
>   Implement drain_call_rcu and use it in hmp_device_del
>   device-core: use RCU for list of childs of a bus
>   device-core: use atomic_set on .realized property
>   virtio-scsi: don't touch scsi devices that are not yet realized or
>     about to be un-realized
>   scsi: Add scsi_device_get
>   virtio-scsi: use scsi_device_get
> 
>  hw/core/bus.c          | 28 +++++++++++++--------
>  hw/core/qdev.c         | 56 +++++++++++++++++++++++++++++++-----------
>  hw/scsi/scsi-bus.c     | 48 +++++++++++++++++++++++++++++++-----
>  hw/scsi/virtio-scsi.c  | 47 ++++++++++++++++++++++++++++-------
>  include/hw/qdev-core.h | 11 +++++++++
>  include/hw/scsi/scsi.h |  2 ++
>  include/qemu/rcu.h     |  1 +
>  qdev-monitor.c         | 22 +++++++++++++++++
>  util/rcu.c             | 55 +++++++++++++++++++++++++++++++++++++++++
>  9 files changed, 230 insertions(+), 40 deletions(-)
> 
> -- 
> 2.26.2
> 
Very gentle ping about this patch series.

Best regards,
        Maxim Levitsky




reply via email to

[Prev in Thread] Current Thread [Next in Thread]