From: John Snow
Subject: [Bug 1681439] Re: qemu-system-x86_64: hw/ide/core.c:685: ide_cancel_dma_sync: Assertion `s->bus->dma->aiocb == NULL' failed.
Date: Wed, 04 Nov 2020 23:48:28 -0000

TLDR: I am not actively working on this, because the problem extends
well below IDE and I don't have the bandwidth to take point on this at
the moment.

Here's a writeup I sent to qemu-devel on 2020-07-30:


First, the (partially bogus, fuzzer-generated) IDE command wants to:

1. dma write 259 sectors starting at sector 1

2. Provides a PRDT at addr 0x00 whose first PRD entry describes a data
buffer at 0xffffffff of length 0x10000. [a]

3. The remaining 8,191 PRD entries are uninitialized memory that all
wind up describing the same data buffer at 0x00 of length 0x10000.

Generally, the entire PRDT is going to be read, but truncated into an
SGList that's exactly as long as the IDE command. Here, that's 0x20600
bytes.
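(259 sectors * 512 bytes per sector = 132,608 bytes = 0x20600.)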

Yadda, yadda, yadda, that winds up turning into these map requests:

addr 0xffffffff; len 0x10000
  -- mapped length: 0x01 (normal map return)

addr 0x100000000; len 0xffff
  -- mapped length: 0x1000 (bounce buffer return)

addr 0x100001000; len 0xefff
  -- bounce buffer is busy, cannot map
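
For context, the mapping loop in dma_blk_cb goes roughly like this
(paraphrased from memory of dma-helpers.c, so field names and the
dma_memory_map signature may differ a bit between QEMU versions):

    while (dbs->sg_cur_index < dbs->sg->nsg) {
        cur_addr = dbs->sg->sg[dbs->sg_cur_index].base + dbs->sg_cur_byte;
        cur_len = dbs->sg->sg[dbs->sg_cur_index].len - dbs->sg_cur_byte;
        /* dma_memory_map can return a shorter length than requested: a
         * direct map stops at the end of the contiguous region (the 0x01
         * case above), and the fallback bounce buffer is at most one page
         * (the 0x1000 case above). */
        mem = dma_memory_map(dbs->sg->as, cur_addr, &cur_len, dbs->dir);
        if (!mem) {
            /* e.g. the single bounce buffer is already in use */
            break;
        }
        qemu_iovec_add(&dbs->iov, mem, cur_len);
        dbs->sg_cur_byte += cur_len;
        if (dbs->sg_cur_byte == dbs->sg->sg[dbs->sg_cur_index].len) {
            dbs->sg_cur_byte = 0;
            ++dbs->sg_cur_index;
        }
    }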

Then it proceeds and calls the iofunc. We return to dma_blk_cb and then:

unmap 0xffffffff; len 0x01; access_len 0x01;

... That unmaps the "direct" one, but we seemingly fail to unmap the
indirect one.

Uh, cool. When we build the IOV, we build it with two entries; but
qemu_iovec_discard_back discards the second entry entirely without
unmapping it.

IDE asks for an alignment of BDRV_SECTOR_SIZE (512 bytes). The IDE state
machine transfers an entire sector or nothing at all. The total IOV size
we have built so far is 0x1001 bytes, which, as you might have noticed,
is not aligned.

So, we try to align it:

qemu_iovec_discard_back(&dbs->iov, QEMU_ALIGN_DOWN(4097, 512))

... I think we probably wanted to ask to shave off one byte instead of
asking to shave off 4096 bytes.
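
In other words, a sketch of the fix to the arithmetic (not a tested
patch; field names as I recall them from dma-helpers.c):

    /* What the code does today: asks to shave off
     * QEMU_ALIGN_DOWN(0x1001, 512) = 0x1000 bytes from the back,
     * leaving only the 1 unaligned byte. */
    qemu_iovec_discard_back(&dbs->iov,
                            QEMU_ALIGN_DOWN(dbs->iov.size, dbs->align));

    /* What was presumably intended: shave off only the unaligned
     * remainder (0x1001 - 0x1000 = 1 byte), keeping the aligned part. */
    qemu_iovec_discard_back(&dbs->iov,
                            dbs->iov.size - QEMU_ALIGN_DOWN(dbs->iov.size,
                                                            dbs->align));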


So, a few perceived problems with dma_blk_cb:

1. Our alignment math is wrong. discard_back takes as an argument the
number of bytes to discard, not the number of bytes you want to have
afterwards.

2. qemu_iovec_discard_back will happily unwind entire IO vectors that we
would need to unmap and have now lost. Worse, whenever we do any kind of
truncation at all, those bytes are not re-added to the source SGList, so
subsequent transfers will skip some bytes in the guest SGList. (See the
sketch after this list.)

3. The dma_blk_io interfaces never check whether the SGList is an even
multiple of the alignment. They don't return a synchronous error, and no
callers check for an error case. (Though BMDMA does carefully prepare
the SGList such that it is aligned in this way; AHCI does too, IIRC.)
This means we might have an unaligned tail that we will just drop or
ignore, leading to another partial DMA.

4. There's no guarantee that any given individual IO vector will have an
entire sector's worth of data in it. It is theoretically valid to
describe a series of vectors of two bytes each. If we can only map one
or two vectors at a time, depending on circumstances, we're never going
to be able to scrounge up enough buffer real estate to transfer an
entire sector.
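
For (2), a fix would also have to unmap whatever it drops from the tail
of the IOV and hand those bytes back to the SGList cursor so a later
pass can re-map them. Very roughly, with the cursor rewind left as a
hand-wave (a sketch of the idea, not actual QEMU code):

    size_t drop = dbs->iov.size - QEMU_ALIGN_DOWN(dbs->iov.size, dbs->align);

    while (drop > 0 && dbs->iov.niov > 0) {
        struct iovec *last = &dbs->iov.iov[dbs->iov.niov - 1];

        if (drop >= last->iov_len) {
            /* This whole vector is going away: unmap it (access_len 0,
             * nothing was transferred through it yet)... */
            dma_memory_unmap(dbs->sg->as, last->iov_base, last->iov_len,
                             dbs->dir, 0);
            /* ...and rewind sg_cur_index/sg_cur_byte by last->iov_len so
             * these guest bytes are re-mapped on the next pass (omitted). */
            drop -= last->iov_len;
            qemu_iovec_discard_back(&dbs->iov, last->iov_len);
        } else {
            /* Only the tail of a vector we keep is dropped: shorten the
             * IOV (the mapping stays live), but these bytes still need to
             * be handed back to the SGList cursor as well. */
            qemu_iovec_discard_back(&dbs->iov, drop);
            drop = 0;
        }
    }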


[a] This is against the BMDMA spec. The address must be aligned to 0x02
and cannot cross a 64K boundary. bit0 is documented as always being
zero, but it's not clear what should happen if the boundary constraint
is violated. Absent other concerns, it might just be easiest to fulfill
the transfer if it's possible.
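
For reference, a physical region descriptor is two little-endian
dwords; written out as a struct purely for illustration (layout per the
Bus Master IDE / SFF-8038i spec):

    typedef struct PRDEntry {
        uint32_t base;   /* physical buffer address; bit 0 must be 0 and the
                            buffer must not cross a 64K boundary */
        uint16_t count;  /* byte count; bit 0 must be 0; 0 means 64K */
        uint16_t flags;  /* bit 15 (EOT) marks the last entry in the table */
    } PRDEntry;

Eight bytes per entry, so the 8,192 entries read here span a full 64K
PRDT.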


** Changed in: qemu
     Assignee: John Snow (jnsnow) => (unassigned)

** Changed in: qemu
       Status: In Progress => Confirmed

** Summary changed:

- qemu-system-x86_64: hw/ide/core.c:685: ide_cancel_dma_sync: Assertion 
`s->bus->dma->aiocb == NULL' failed.
+ dma_blk_cb leaks memory map handles on misaligned IO

** Description changed:

+ Maintainer Edit:
+ 
+ The functions in dma-helpers mismanage misaligned IO, badly enough to
+ cause an infinite loop where no progress can be made. This allows the
+ IDE state machine to get wedged such that cancelling DMA can fail,
+ because the DMA helpers have bodged the state of the DMA transfer. See
+ Comment #15 for the in-depth analysis.
+ 
+ I've updated the name of this bug to reflect the current status as I
+ understand it.
+ 
+ --js
+ 
+ 
+ Original report:
+ 
  Since upgrading to QEMU 2.8.0, my Windows 7 64-bit virtual machines
  started crashing due to the assertion quoted in the summary failing.
  The assertion in question was added by commit 9972354856 ("block: add
  BDS field to count in-flight requests").  My tests show that setting
  discard=unmap is needed to reproduce the issue.  Speaking of
  reproduction, it is a bit flaky, because I have been unable to come up
  with specific instructions that would allow the issue to be triggered
  outside of my environment, but I do have a semi-sane way of testing that
  appears to depend on a specific initial state of data on the underlying
  storage volume, actions taken within the VM and waiting for about 20
  minutes.
  
  Here is the shortest QEMU command line that I managed to reproduce the
  bug with:
  
-     qemu-system-x86_64 \
-         -machine pc-i440fx-2.7,accel=kvm \
-         -m 3072 \
-         -drive file=/dev/lvm/qemu,format=raw,if=ide,discard=unmap \
-       -netdev tap,id=hostnet0,ifname=tap0,script=no,downscript=no,vhost=on \
-         -device virtio-net-pci,netdev=hostnet0 \
-       -vnc :0
+     qemu-system-x86_64 \
+         -machine pc-i440fx-2.7,accel=kvm \
+         -m 3072 \
+         -drive file=/dev/lvm/qemu,format=raw,if=ide,discard=unmap \
+  -netdev tap,id=hostnet0,ifname=tap0,script=no,downscript=no,vhost=on \
+         -device virtio-net-pci,netdev=hostnet0 \
+  -vnc :0
  
  The underlying storage (/dev/lvm/qemu) is a thin LVM snapshot.
  
  QEMU was compiled using:
  
-     ./configure --python=/usr/bin/python2.7 --target-list=x86_64-softmmu
-     make -j3
+     ./configure --python=/usr/bin/python2.7 --target-list=x86_64-softmmu
+     make -j3
  
  My virtualization environment is not really a critical one and
  reproduction is not that much of a hassle, so if you need me to gather
  further diagnostic information or test patches, I will be happy to help.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1681439

Title:
  dma_blk_cb leaks memory map handles on misaligned IO

Status in QEMU:
  Confirmed

Bug description:
  Maintainer Edit:

  The functions in dma-helpers mismanage misaligned IO, badly enough to
  cause an infinite loop where no progress can be made. This allows the
  IDE state machine to get wedged such that cancelling DMA can fail,
  because the DMA helpers have bodged the state of the DMA transfer. See
  Comment #15 for the in-depth analysis.

  I've updated the name of this bug to reflect the current status as I
  understand it.

  --js

  
  Original report:

  Since upgrading to QEMU 2.8.0, my Windows 7 64-bit virtual machines
  started crashing due to the assertion quoted in the summary failing.
  The assertion in question was added by commit 9972354856 ("block: add
  BDS field to count in-flight requests").  My tests show that setting
  discard=unmap is needed to reproduce the issue.  Speaking of
  reproduction, it is a bit flaky, because I have been unable to come up
  with specific instructions that would allow the issue to be triggered
  outside of my environment, but I do have a semi-sane way of testing that
  appears to depend on a specific initial state of data on the underlying
  storage volume, actions taken within the VM and waiting for about 20
  minutes.

  Here is the shortest QEMU command line that I managed to reproduce the
  bug with:

      qemu-system-x86_64 \
          -machine pc-i440fx-2.7,accel=kvm \
          -m 3072 \
          -drive file=/dev/lvm/qemu,format=raw,if=ide,discard=unmap \
          -netdev tap,id=hostnet0,ifname=tap0,script=no,downscript=no,vhost=on \
          -device virtio-net-pci,netdev=hostnet0 \
          -vnc :0

  The underlying storage (/dev/lvm/qemu) is a thin LVM snapshot.

  QEMU was compiled using:

      ./configure --python=/usr/bin/python2.7 --target-list=x86_64-softmmu
      make -j3

  My virtualization environment is not really a critical one and
  reproduction is not that much of a hassle, so if you need me to gather
  further diagnostic information or test patches, I will be happy to help.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1681439/+subscriptions


