[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-8.0] ide: Fix manual in-flight count for TRIM BH

From: Fiona Ebner
Subject: Re: [PATCH for-8.0] ide: Fix manual in-flight count for TRIM BH
Date: Mon, 13 Mar 2023 13:29:25 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.8.0

Am 10.03.23 um 16:13 schrieb Paolo Bonzini:
> On Fri, Mar 10, 2023 at 3:25 PM Kevin Wolf <kwolf@redhat.com> wrote:
>>> 1. The TRIM operation should be completed on the IDE level before
>>> draining ends.
>>> 2. Block layer requests issued after draining has begun are queued.
>>> To me, the conclusion seems to be:
>>> Issue all block layer requests belonging to the IDE TRIM operation up
>>> front.
>>> The other alternative I see is to break assumption 2, introduce a way
>>> to not queue certain requests while drained, and use it for the
>>> recursive requests issued by ide_issue_trim_cb. But not the initial
>>> one, if that would defeat the purpose of request queuing. Of course
>>> this can't be done if QEMU relies on the assumption in other places
>>> already.
>> I feel like this should be allowed because if anyone has exclusive
>> access in this scenario, it's IDE, so it should be able to bypass the
>> queuing. Of course, the queuing is still needed if someone else drained
>> the backend, so we can't just make TRIM bypass it in general. And if you
>> make it conditional on IDE being in blk_drain(), it already starts to
>> become ugly again...
>> So maybe the while loop is unavoidable.
>> Hmm... But could ide_cancel_dma_sync() just directly use
>> AIO_WAIT_WHILE(s->bus->dma->aiocb) instead of using blk_drain()?
> While that should work, it would not fix other uses of
> bdrv_drain_all(), for example in softmmu/cpus.c. Stopping the device
> model relies on those to run *until the device model has finished
> submitting requests*.
> So I still think that this bug is a symptom of a problem in the design
> of request queuing.
> In fact, shouldn't request queuing was enabled at the _end_ of
> bdrv_drained_begin (once the BlockBackend has reached a quiescent
> state on its own terms), rather than at the beginning (which leads to
> deadlocks like this one)?
> blk->quiesce_counter becomes just a nesting counter for
> drained_begin/end, with no uses outside, and blk_wait_while_drained
> uses a new counter. Then you have something like this in
> blk_root_drained_poll:
>     if (blk->dev_ops && blk->dev_ops->drained_poll) {
>         busy = blk->dev_ops->drained_poll(blk->dev_opaque);
>     }
>     busy |= !!blk->in_flight;
>     if (!busy) {
>        qatomic_set(&blk->queue_requests, true);
>     }
>     return busy;
> And there's no need to touch IDE at all.
Couldn't this lead to scenarios where a busy or malicious guest, which
continues to submit new requests, slows down draining or even prevents
it from finishing?

Best Regards,

reply via email to

[Prev in Thread] Current Thread [Next in Thread]