[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-8.0] ide: Fix manual in-flight count for TRIM BH

From: Fiona Ebner
Subject: Re: [PATCH for-8.0] ide: Fix manual in-flight count for TRIM BH
Date: Fri, 10 Mar 2023 14:05:11 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.8.0

Am 09.03.23 um 18:46 schrieb Kevin Wolf:
> Am 09.03.2023 um 14:59 hat Paolo Bonzini geschrieben:
>> On 3/9/23 13:31, Hanna Czenczek wrote:
>>> On 09.03.23 13:08, Paolo Bonzini wrote:
>>>> On Thu, Mar 9, 2023 at 1:05 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>>> I think having to do this is problematic, because the blk_drain should
>>>>> leave no pending operation.
>>>>> Here it seems okay because you do it in a controlled situation, but the
>>>>> main thread can also issue blk_drain(), or worse bdrv_drained_begin(),
>>>>> and there would be pending I/O operations when it returns.
>>> Not really.  We would stop in the middle of a trim that processes a list
>>> of discard requests.  So I see it more like stopping in the middle of
>>> anything that processes guest requests.  Once drain ends, we continue
>>> processing them, and that’s not exactly pending I/O.
>>> There is a pending object in s->bus->dma->aiocb on the IDE side, so
>>> there is a pending DMA operation, but naïvely, I don’t see that as a
>>> problem.
>> What about the bdrv_drain_all() when a VM stops, would the guest continue to
>> access memory and disks after bdrv_drain() return?
> That one shouldn't be a problem because the devices are stopped before
> the backends.
>> Migration could also be a problem, because the partial TRIM would not be
>> recorded in the s->bus->error_status field of IDEState (no surprise there,
>> it's not an error).  Also, errors happening after bdrv_drain() might not be
>> migrated correctly.
> Yes, I think it would be good to have the I/O operation fully completed
> on the IDE level rather than just in the block layer.
>>> Or the issue is generally that IDE uses dma_* functions, which might
>>> cause I/O functions to be run from new BHs (I guess through
>>> reschedule_dma()?).
> None of those complicated scenarios actually. The problem solved by the
> request queuing is simply that nothing else stops the guest from
> submitting new requests to drained nodes if the CPUs are still running.
> Drain uses aio_disable_external() to disable processing of external
> events, in particular the ioeventfd used by virtio-blk and virtio-scsi.
> But IDE submits requests through MMIO or port I/O, i.e. the vcpu thread
> exits to userspace and calls directly into the IDE code, so it's
> completely unaffected by aio_disable_external().
>> Ah, you mean that you can have pending I/O operations while blk->in_flight
>> is zero?  That would be a problem indeed.  We already have BlockDevOps for
>> ide-cd and ide-hd, should we add a .drained_poll callback there?
> To be more precise, you suggested in the call that .drained_poll should
> return that IDE is still busy while aiocb != NULL. Without having looked
> at the code in detail yet, that seems to make sense to me. And maybe
> even the blk_inc/dec_in_flight() pair can then go completely away
> because IDE takes care of its drain state itself then.

I assume the addition of drained_poll is meant to be orthogonal to the
fix of the deadlock? At least I can't see how it would help there?

If we have the assumptions:
1. The TRIM operation should be completed on the IDE level before
draining ends.
2. Block layer requests issued after draining has begun are queued.

To me, the conclusion seems to be:
Issue all block layer requests belonging to the IDE TRIM operation up front.

The other alternative I see is to break assumption 2, introduce a way to
not queue certain requests while drained, and use it for the recursive
requests issued by ide_issue_trim_cb. But not the initial one, if that
would defeat the purpose of request queuing. Of course this can't be
done if QEMU relies on the assumption in other places already.

Best Regards,

reply via email to

[Prev in Thread] Current Thread [Next in Thread]