qemu-devel

Re: Deadlock with ide_issue_trim and draining


From: Hanna Czenczek
Subject: Re: Deadlock with ide_issue_trim and draining
Date: Wed, 8 Mar 2023 16:02:51 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.7.1

On 08.03.23 11:35, Fiona Ebner wrote:
Am 07.03.23 um 15:27 schrieb Hanna Czenczek:
On 07.03.23 14:44, Hanna Czenczek wrote:
On 07.03.23 13:22, Fiona Ebner wrote:
Hi,
I am suspecting that commit 7e5cdb345f ("ide: Increment BB in-flight
counter for TRIM BH") introduced an issue in combination with draining.

From a debug session on a customer's machine, I gathered the following information:
* The QEMU process hangs in aio_poll called during draining and doesn't
progress.
* The in_flight counter for the BlockDriverState is 0 and for the
BlockBackend it is 1.
* There is a blk_aio_pdiscard_entry request in the BlockBackend's
queued_requests.
* The drive is attached via ahci.

I suspect that something like the following happened:

1. ide_issue_trim is called, and increments the in_flight counter.
2. ide_issue_trim_cb calls blk_aio_pdiscard.
3. somebody else starts draining.
4. ide_issue_trim_cb is called as the completion callback for
blk_aio_pdiscard.
5. ide_issue_trim_cb issues yet another blk_aio_pdiscard request.
6. The request is added to the wait queue via blk_wait_while_drained,
because draining has been started (see the sketch after this list).
7. Nobody ever decrements the in_flight counter and draining can't
finish.
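
For step 6: if I read blk_wait_while_drained() correctly, it does roughly the
following (paraphrased sketch, field names from memory, not the exact code):

    /* Paraphrased sketch of blk_wait_while_drained() in block/block-backend.c */
    static void coroutine_fn blk_wait_while_drained(BlockBackend *blk)
    {
        if (blk->quiesce_counter && !blk->disable_request_queuing) {
            /* The request drops its own in-flight reference and parks itself
             * in blk->queued_requests until the drained section ends. */
            blk_dec_in_flight(blk);
            qemu_co_queue_wait(&blk->queued_requests, NULL);
            blk_inc_in_flight(blk);
        }
    }

If that is right, the queued pdiscard is not what keeps the BlockBackend's
in_flight at 1; the remaining count must be the one taken in ide_issue_trim(),
and the only thing that would drop it (the TRIM BH) cannot run before the
queued pdiscard has completed, i.e. before the drain has ended.
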
Sounds about right.

The issue occurs very rarely and is difficult to reproduce, but with the
help of GDB, I'm able to do it rather reliably:
1. Use GDB to break on blk_aio_pdiscard.
2. Run mkfs.ext4 on a huge disk in the guest.
3. Issue a drive-backup QMP command after landing on the breakpoint.
4. Continue a few times in GDB.
5. After that I can observe the same situation as described above.

I'd be happy about suggestions for how to fix it. Unfortunately, I don't
see a clear-cut way at the moment. The only idea I have right now is to
change the code to issue all discard requests at the same time, but I
fear there might be pitfalls with that.
The point of 7e5cdb345f was that we need some in-flight count to
accompany a set s->bus->dma->aiocb.  While blk_aio_pdiscard() is
happening, we don’t necessarily need another count.  But we do need it
while there is no blk_aio_pdiscard().

ide_issue_trim_cb() returns in two cases (and, recursively through its
callers, leaves s->bus->dma->aiocb set):
1. After calling blk_aio_pdiscard(), which will keep an in-flight count,
2. After calling replay_bh_schedule_event() (i.e. qemu_bh_schedule()),
which does not keep an in-flight count (see the simplified sketch below).
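
In simplified form, the structure is (sketch from memory, not the actual code;
offset/bytes stand in for the values parsed from the TRIM payload):

    static void ide_issue_trim_cb(void *opaque, int ret)
    {
        TrimAIOCB *iocb = opaque;
        IDEState *s = iocb->s;

        if (ret >= 0 /* and there are TRIM ranges left to process */) {
            /* Case 1: while the pdiscard is running, the request itself
             * carries a BlockBackend in-flight count. */
            iocb->aiocb = blk_aio_pdiscard(s->blk, offset, bytes,
                                           ide_issue_trim_cb, opaque);
            return;
        }

        /* Case 2: all ranges done (or error).  Only the BH is left, and a
         * scheduled BH by itself does not count as an in-flight request. */
        iocb->aiocb = NULL;
        if (iocb->bh) {
            replay_bh_schedule_event(iocb->bh);
        }
    }
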

Perhaps we just need to move the blk_inc_in_flight() above the
replay_bh_schedule_event() call?
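
I.e. something along these lines (untested sketch, not a real patch), with the
increment currently taken in ide_issue_trim() going away and the matching
blk_dec_in_flight() staying in the BH callback:

    /* Tail of ide_issue_trim_cb(), with the increment moved here */
    iocb->aiocb = NULL;
    if (iocb->bh) {
        /* Paired with the blk_dec_in_flight() in the BH callback; the extra
         * count now only covers the BH window, so no blk_aio_pdiscard() can
         * get parked behind it during a drain. */
        blk_inc_in_flight(s->blk);
        replay_bh_schedule_event(iocb->bh);
    }
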
FWIW, doing so at least keeps the reproducer from
https://bugzilla.redhat.com/show_bug.cgi?id=2029980 working.

And I'm not able to run into my current issue anymore :) Thank you!

Great! :)

FWIW, the suggested change and explanation sound good to me. Are you
going to send a patch for it?

Sure, can do.

Hanna



