From: Paolo Bonzini
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH v2 11/17] block-backend: Decrease in_flight only after callback
Date: Mon, 17 Sep 2018 19:08:16 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

On 17/09/2018 18:51, Kevin Wolf wrote:
> Am 17.09.2018 um 17:59 hat Paolo Bonzini geschrieben:
>> On 17/09/2018 14:53, Kevin Wolf wrote:
>>>>> I think I can drop the ref/unref pair, but not the whole patch (whose
>>>>> main point is reordering dec_in_flight vs. the AIO callback).
>>>>
>>>> You're right, though I think I did that on purpose back in the day.
>>>> IIRC it was related to bdrv_drain, which might never complete if called
>>>> from an AIO callback.
>>>
>>> Hm... This seems to become a common pattern, it's the same as for the
>>> job completion callbacks (only improved enough for the bug at hand to
>>> disappear instead of properly fixed in "blockjob: Lie better in
>>> child_job_drained_poll()").
>>>
>>> Either you say there is no activity even though there is still a
>>> callback pending, then bdrv_drain() called from elsewhere will return
>>> too early and we get a bug. Or you say there is activity, then any
>>> nested drain inside that callback will deadlock and we get a bug, too.
>>>
>>> So I suppose we need some way to know which activities to ignore during
>>> drain, depending on who is the caller? :-/
>>
>> Some alternatives:
>>
>> 1) anything that needs to invoke I/O for callbacks must use inc/dec
>> in flight manually. Simplest but hard to assert though. At least SCSI
>> and IDE are broken.
>>
>> 2) callbacks cannot indirectly result in bdrv_drain. Sounds easier
>> since there are not many AIO callers anymore - plus it also seems easier
>> to add debugging code for.
>>
>> 3) we provide device callbacks for "am I busy" and invoke them from
>> bdrv_drain's poll loop.
>>
>> Just off the top of my head. Not sure which is best.
>
> I don't see how 1) and 3) can solve the problem. You still need to
> declare something busy when someone else drains (so the drain doesn't
> return too early) and not busy when it calls a nested drain (so that it
> doesn't deadlock).
>
> 2) would obviously solve the problem, but I'm afraid it's not realistic
> when you consider that block jobs happen _only_ in callbacks. That much
> of this is disguised as coroutine code doesn't change the call chains.
> You also can't use BHs to escape the AIO callback, because then the BH is
> logically part of the operation that needs to be completed for an
> external drain to return, so you would still have to call the job busy
> and would deadlock in a nested drain in that BH. So you're back to
> square one.
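
The dilemma described in the quoted text can be reproduced with a small toy model. This is only a sketch in Python, not QEMU code: `ToyNode`, `submit`, `poll_once`, `drain` and the `dec_before_callback` switch are all hypothetical names, and a "deadlock" is reported as an exception instead of an actual hang. The value recorded inside the callback is what a drain in some other context would see while the callback is still running.

```python
class ToyNode:
    """Toy stand-in for a node with an in-flight counter (not QEMU code)."""

    def __init__(self, dec_before_callback):
        # Hypothetical switch: decrement in_flight before or after the callback.
        self.dec_before_callback = dec_before_callback
        self.in_flight = 0
        self.pending = []  # completion callbacks waiting to run

    def submit(self, callback):
        self.in_flight += 1
        self.pending.append(callback)

    def poll_once(self):
        """Complete one request; return False if nothing can make progress."""
        if not self.pending:
            return False
        callback = self.pending.pop(0)
        if self.dec_before_callback:
            self.in_flight -= 1
            callback(self)
        else:
            callback(self)
            self.in_flight -= 1
        return True

    def drain(self):
        """Poll until idle; raise instead of hanging when no progress is possible."""
        while self.in_flight > 0:
            if not self.poll_once():
                raise RuntimeError("deadlock: in_flight > 0 but nothing to poll")


def run(dec_before_callback):
    node = ToyNode(dec_before_callback)
    seen = []

    def callback(n):
        seen.append(n.in_flight)  # what another drainer would observe right now
        n.drain()                 # nested drain from inside the callback

    node.submit(callback)
    try:
        node.drain()
        outcome = "completed"
    except RuntimeError:
        outcome = "deadlock"
    return seen[0], outcome


early = run(dec_before_callback=True)   # node looks idle while the callback runs
late = run(dec_before_callback=False)   # nested drain waits on its own request
```

With the decrement first, the nested drain completes, but the node already reports zero requests in flight while the callback is pending, so a drain from elsewhere could return too early. With the decrement last, the node stays busy, and the nested drain can never finish.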

But then basically the main issue is mirror.c's call to
bdrv_drained_begin/end. There are no other calls to
bdrv_drained_begin/end inside coroutines IIRC.

Another long-standing idea is to replace aio_disable/enable_external
with implementations of the (currently unused) dev_ops callbacks
.drained_begin and .drained_end. What if jobs used those callbacks to
pause themselves(*), and block/mirror.c had a pause point before the
call to bdrv_drained_begin?

(*) of course block/mirror.c would need some smartness to
not pause itself when the job itself is asking to drain!

Paolo
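
As a rough illustration of the idea in this message, the following Python toy model shows jobs pausing themselves from drained_begin/drained_end style callbacks, with a pause point before a job's own drain. It is a sketch under assumptions, not QEMU code: `ToyJob`, `ToyNode`, `mirror_step`, `pause_count` and the `requester` argument are hypothetical, and passing the requester explicitly is just one possible way to model the "smartness" that keeps a job from pausing for its own drain.

```python
class ToyJob:
    """Toy stand-in for a block job (not QEMU's BlockJob)."""

    def __init__(self, name):
        self.name = name
        self.pause_count = 0  # > 0 means some external drain is active

    # Hypothetical analogues of the .drained_begin/.drained_end callbacks.
    def drained_begin(self, requester):
        if requester is not self:  # do not pause for our own drain
            self.pause_count += 1

    def drained_end(self, requester):
        if requester is not self:
            self.pause_count -= 1


class ToyNode:
    """Toy node that notifies its users when a drained section starts or ends."""

    def __init__(self, users):
        self.users = users

    def drained_begin(self, requester=None):
        for user in self.users:
            user.drained_begin(requester)

    def drained_end(self, requester=None):
        for user in self.users:
            user.drained_end(requester)


def mirror_step(job, node):
    """One iteration of a job that wants to drain the node itself."""
    if job.pause_count > 0:
        return "paused"  # pause point before the job's own drained_begin
    node.drained_begin(requester=job)
    snapshot = {user.name: user.pause_count for user in node.users}
    node.drained_end(requester=job)
    return snapshot


mirror, other = ToyJob("mirror"), ToyJob("other")
node = ToyNode([mirror, other])

node.drained_begin()                         # external drain: both jobs pause
during_external = mirror_step(mirror, node)  # mirror stops at its pause point
node.drained_end()
after_external = mirror_step(mirror, node)   # mirror drains; only "other" pauses
```

In this model the job never starts its own drain while an external drained section is active, and its own drain pauses the other users of the node without pausing the job itself.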