Re: Deadlock with ide_issue

qemu-block

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Deadlock with ide_issue_trim and draining

From:	Hanna Czenczek
Subject:	Re: Deadlock with ide_issue_trim and draining
Date:	Wed, 8 Mar 2023 19:03:58 +0100
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.7.1

On 07.03.23 14:44, Hanna Czenczek wrote:

On 07.03.23 13:22, Fiona Ebner wrote:

Hi,
I am suspecting that commit 7e5cdb345f ("ide: Increment BB in-flight
counter for TRIM BH") introduced an issue in combination with draining.

 From a debug session on a costumer's machine I gathered the following
information:
* The QEMU process hangs in aio_poll called during draining and doesn't
progress.
* The in_flight counter for the BlockDriverState is 0 and for the
BlockBackend it is 1.
* There is a blk_aio_pdiscard_entry request in the BlockBackend's
queued_requests.
* The drive is attached via ahci.

I suspect that something like the following happened:

1. ide_issue_trim is called, and increments the in_flight counter.
2. ide_issue_trim_cb calls blk_aio_pdiscard.
3. somebody else starts draining.
4. ide_issue_trim_cb is called as the completion callback for
blk_aio_pdiscard.
5. ide_issue_trim_cb issues yet another blk_aio_pdiscard request.
6. The request is added to the wait queue via blk_wait_while_drained,
because draining has been started.

7. Nobody ever decrements the in_flight counter and draining can'tfinish.


Sounds about right.

The issue occurs very rarely and is difficult to reproduce, but with the
help of GDB, I'm able to do it rather reliably:
1. Use GDB to break on blk_aio_pdiscard.
2. Run mkfs.ext4 on a huge disk in the guest.
3. Issue a drive-backup QMP command after landing on the breakpoint.
4. Continue a few times in GDB.
5. After that I can observe the same situation as described above.

I'd be happy about suggestions for how to fix it. Unfortunately, I don't
see a clear-cut way at the moment. The only idea I have right now is to
change the code to issue all discard requests at the same time, but I
fear there might pitfalls with that?

The point of 7e5cdb345f was that we need any in-flight count toaccompany a set s->bus->dma->aiocb. While blk_aio_pdiscard() ishappening, we don’t necessarily need another count. But we do need itwhile there is no blk_aio_pdiscard().

ide_issue_trim_cb() returns in two cases (and, recursively through itscallers, leaves s->bus->dma->aiocb set):

1. After calling blk_aio_pdiscard(), which will keep an in-flight count,

2. After calling replay_bh_schedule_event() (i.e. qemu_bh_schedule()),which does not keep an in-flight count.

Perhaps we just need to move the blk_inc_in_flight() above thereplay_bh_schedule_event() call?

While writing the commit message for this, I noticed it isn’t quiteright: ide_cancel_dma_sync() drains s->blk only once, so once thein-flight counter goes to 0, s->blk is considered drained andide_cancel_dma_sync() will go on to assert that s->bus->dma->aiocb isnow NULL. However, if we do have a blk_aio_pdiscard() in flight, thedrain will wait only for that one to complete, not for the whole trimoperation to complete, i.e. the next discard or ide_trim_bh_cb() will bescheduled, but neither will necessarily be run before blk_drain() returns.

I’ve attached a reproducer that issues two trim requests. Before7e5cdb345f, it makes qemu crash because the assertion fails (one or twoof the blk_aio_pdiscard()s is drained, but the trim isn’t settled yet). After 7e5cdb345f, qemu hangs because of what you describe (the secondblk_aio_pdiscard() is enqueued, so the drain can’t make progress,resulting in a deadlock). With my proposed fix, qemu crashes again.


(Reproducer is run like this:
$ qemu-system-x86_64 -drive if=ide,file=/tmp/test.bin,format=raw
)

What comes to my mind is either what you’ve proposed initially (issuingall discards simultaneously), or to still use my proposed fix, but alsohave ide_cancel_dma_sync() run blk_drain() in a loop untils->bus->dma->aiocb becomes NULL. (Kind of like my original patch(https://lists.nongnu.org/archive/html/qemu-block/2022-01/msg00024.html),only that we can still use blk_drain() instead of aio_poll() because weincrement the in-flight counter while the completion BH is scheduled.)


Hanna

test.bin
Description: Binary data

test.asm
Description: Text document

[Prev in Thread]

Current Thread

[Next in Thread]

Deadlock with ide_issue_trim and draining, Fiona Ebner, 2023/03/07
- Re: Deadlock with ide_issue_trim and draining, Hanna Czenczek, 2023/03/07
  - Re: Deadlock with ide_issue_trim and draining, Hanna Czenczek, 2023/03/07
    - Re: Deadlock with ide_issue_trim and draining, Fiona Ebner, 2023/03/08
    - Re: Deadlock with ide_issue_trim and draining, Hanna Czenczek, 2023/03/08
  - Re: Deadlock with ide_issue_trim and draining, Hanna Czenczek <=

Prev by Date: Re: [PATCH 1/6] block: don't acquire AioContext lock in bdrv_drain_all()
Next by Date: [PATCH] block: add missing coroutine_fn to bdrv_sum_allocated_file_size()
Previous by thread: Re: Deadlock with ide_issue_trim and draining
Next by thread: [PATCH] pflash: Fix blk_pread_nonzeroes()
Index(es):
- Date
- Thread