qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[RFC] block-backend: prevent dangling BDS pointer in blk_drain()


From: Stefan Hajnoczi
Subject: [RFC] block-backend: prevent dangling BDS pointer in blk_drain()
Date: Thu, 9 Dec 2021 14:23:04 +0000

The BlockBackend root child can change during bdrv_drained_begin() when
aio_poll() is invoked. In fact the BlockDriverState can reach refcnt 0
and blk_drain() is left with a dangling BDS pointer.

One example is scsi_device_purge_requests(), which calls blk_drain() to
wait for in-flight requests to cancel. If the backup blockjob is active,
then the BlockBackend root child is a temporary filter BDS owned by the
blockjob. The blockjob can complete during bdrv_drained_begin() and the
last reference to the BDS is released when the temporary filter node is
removed. This results in a use-after-free when blk_drain() calls
bdrv_drained_end(bs) on the dangling pointer.

The general problem is that a function and its callers must not assume
that bs is still valid across aio_poll(). Explicitly hold a reference to
bs in blk_drain() to avoid the dangling pointer.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
I found that BDS nodes are sometimes deleted with bs->quiesce_counter >
0 (at least when running "make check"), so it is currently not possible
to put the bdrv_ref/unref() calls in bdrv_do_drained_begin() and
bdrv_do_drained_end() because they will be unbalanced. That would have
been a more general solution than only fixing blk_drain().

Any suggestions for a better fix?

I think it's likely that more "dangling pointer across aio_poll()"
problems exist :(.

Here is the (hacky) reproducer:

  build/qemu-system-x86_64 \
     -name 'avocado-vt-vm1'  \
     -sandbox on  \
     -machine q35,memory-backend=mem-machine_mem \
     -device 
pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1
 \
     -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 
 \
     -nodefaults \
     -device VGA,bus=pcie.0,addr=0x2 \
     -m 1024 \
     -object memory-backend-ram,size=1024M,id=mem-machine_mem  \
     -smp 10,maxcpus=10,cores=5,threads=1,dies=1,sockets=2  \
     -cpu 'Cascadelake-Server-noTSX',+kvm_pv_unhalt \
     -chardev 
socket,wait=off,server=on,id=qmp_id_qmpmonitor1,path=/tmp/qmp.sock  \
     -mon chardev=qmp_id_qmpmonitor1,mode=control \
     -chardev 
socket,wait=off,server=on,id=qmp_id_catch_monitor,path=/tmp/catch_monitor.sock  
\
     -mon chardev=qmp_id_catch_monitor,mode=control \
     -device pvpanic,ioport=0x505,id=idgKHYrQ \
     -chardev 
socket,wait=off,server=on,id=chardev_serial0,path=/tmp/serial.sock \
     -device isa-serial,id=serial0,chardev=chardev_serial0  \
     -chardev 
socket,id=seabioslog_id_20211110-012521-TNCkxDmn,path=/tmp/seabios.sock,server=on,wait=off
 \
     -device 
isa-debugcon,chardev=seabioslog_id_20211110-012521-TNCkxDmn,iobase=0x402 \
     -device 
pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
     -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
     -blockdev 
node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=test.img,cache.direct=on,cache.no-flush=off
 \
     -blockdev 
node-name=drive_image1,driver=raw,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1
 \
     -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
     -blockdev 
node-name=file_src1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=sr1.qcow2,cache.direct=on,cache.no-flush=off
 \
     -blockdev 
node-name=drive_src1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_src1
 \
     -device scsi-hd,id=src1,drive=drive_src1,write-cache=on \
     -device 
pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
     -device 
virtio-net-pci,mac=9a:11:64:b0:5d:a8,id=idxnEEYY,netdev=idBjpylo,bus=pcie-root-port-3,addr=0x0
  \
     -netdev user,id=idBjpylo  \
     -vnc :0  \
     -rtc base=utc,clock=host,driftfix=slew  \
     -boot menu=off,order=cdn,once=c,strict=off \
     -enable-kvm \
     -device 
pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=5
 &

  sleep 8 # delay for VM startup and socket creation

  nc -U /tmp/qmp.sock <<EOF
  {'execute': 'qmp_capabilities'}
  {'execute': 'blockdev-create', 'arguments': {'options': {'driver': 'file', 
'filename': 'dst1.qcow2', 'size': 209715200}, 'job-id': 'file_dst1'}, 'id': 
'Fk1bF3FV'}
  EOF
  sleep 1 # wait for blockdev-create completion
  nc -U /tmp/qmp.sock <<EOF
  {'execute': 'qmp_capabilities'}
  {'execute': 'job-dismiss', 'arguments': {'id': 'file_dst1'}, 'id': '13R5TDSj'}
  {'execute': 'blockdev-add', 'arguments': {'node-name': 'file_dst1', 'driver': 
'file', 'filename': 'dst1.qcow2', 'aio': 'threads', 'auto-read-only': true, 
'discard': 'unmap'}, 'id': 'VIzrN0zy'}
  {'execute': 'blockdev-create', 'arguments': {'options': {'driver': 'qcow2', 
'file': 'file_dst1', 'size': 209715200}, 'job-id': 'drive_dst1'}, 'id': 
'YX8t8hBs'}
  EOF
  sleep 1 # wait for blockdev-create completion
  nc -U /tmp/qmp.sock <<EOF
  {'execute': 'qmp_capabilities'}
  {'execute': 'job-dismiss', 'arguments': {'id': 'drive_dst1'}, 'id': 
'OTZwYb7J'}
  {'execute': 'blockdev-add', 'arguments': {'node-name': 'drive_dst1', 
'driver': 'qcow2', 'file': 'file_dst1', 'read-only': false}, 'id': 'QHyUxtql'}
  {'execute': 'system_reset', 'id': 'OREutgnz'}
  {'execute': 'blockdev-backup', 'arguments': {'device': 'drive_src1', 
'target': 'drive_dst1', 'job-id': 'drive_src1_qnFF', 'sync': 'full', 'speed': 
0, 'compress': false, 'auto-finalize': true, 'auto-dismiss': true, 
'on-source-error': 'report', 'on-target-error': 'report'}, 'id': 'WbDARa8c'}
  EOF
---
 block/block-backend.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index 12ef80ea17..5608c0451b 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1705,6 +1705,7 @@ void blk_drain(BlockBackend *blk)
     BlockDriverState *bs = blk_bs(blk);
 
     if (bs) {
+        bdrv_ref(bs);
         bdrv_drained_begin(bs);
     }
 
@@ -1714,6 +1715,7 @@ void blk_drain(BlockBackend *blk)
 
     if (bs) {
         bdrv_drained_end(bs);
+        bdrv_unref(bs);
     }
 }
 
-- 
2.33.1




reply via email to

[Prev in Thread] Current Thread [Next in Thread]