From: John Snow
Subject: Re: [Qemu-stable] [Qemu-devel] [Qemu-block] [PATCH v0 0/2] Postponed actions
Date: Mon, 27 Aug 2018 12:05:37 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0


On 08/27/2018 03:05 AM, Denis Plotnikov wrote:
> PING! PING!
> 

Sorry, Kevin and Stefan are both on PTO right now, I think. I can't
promise I have the time to look soon, but you at least deserve an answer
for the radio silence over the last week.

--js

> On 14.08.2018 10:08, Denis Plotnikov wrote:
>>
>>
>> On 13.08.2018 19:30, Kevin Wolf wrote:
>>> Am 13.08.2018 um 10:32 hat Denis Plotnikov geschrieben:
>>>> Ping ping!
>>>>
>>>> On 16.07.2018 21:59, John Snow wrote:
>>>>>
>>>>>
>>>>> On 07/16/2018 11:01 AM, Denis Plotnikov wrote:
>>>>>> Ping!
>>>>>>
>>>>>
>>>>> I never saw a reply to Stefan's question on July 2nd; did you reply
>>>>> off-list?
>>>>>
>>>>> --js
>>>> Yes, I did. I talked to Stefan about why the patch set appeared.
>>>
>>> The rest of us still don't know the answer. I had the same question.
>>>
>>> Kevin
>> Yes, that's my fault. I should have posted it earlier.
>>
>> I reviewed the problem once again and came up with the following
>> explanation.
>> Indeed, if the global lock has been taken by the main thread, the vCPU
>> threads won't be able to execute IDE MMIO.
>> But if the main thread releases the lock, then nothing will prevent the
>> vCPU threads from executing whatever they want, e.g. writing to the
>> block device.
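>>
>> (A note on why the lock matters here: IDE's memory regions use "global
>> locking", so a vCPU thread doing MMIO to the IDE device has to take the
>> global lock first. Roughly, simplified from prepare_mmio_access() in
>> exec.c; not the verbatim code:)
>>
>>     /* simplified sketch */
>>     if (!qemu_mutex_iothread_locked() && mr->global_locking) {
>>         /* the vCPU thread blocks here while the main thread holds it */
>>         qemu_mutex_lock_iothread();
>>     }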
>>
>> This can happen while mirroring is running. Let's take a look
>> at the following snippet of mirror_run. This is part of the mirroring
>> completion path.
>>
>>              bdrv_drained_begin(bs);
>>              cnt = bdrv_get_dirty_count(s->dirty_bitmap);
>>  >>>>>>      if (cnt > 0 || mirror_flush(s) < 0) {
>>                  bdrv_drained_end(bs);
>>                  continue;
>>              }
>>
>> (X) >>>>    assert(QLIST_EMPTY(&bs->tracked_requests));
>>
>> mirror_flush here can yield the current coroutine, so nothing more can
>> be executed in it.
>> We can end up in a situation where the main loop has to iterate again
>> to poll for another timer/BH to process. While iterating, it releases
>> the global lock. If a vCPU (or any other) thread is waiting for the
>> global lock, that thread will acquire the lock and do whatever it
>> intends.
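>>
>> (For reference, the lock is dropped around the poll itself. A simplified
>> sketch of the relevant part of os_host_main_loop_wait() in
>> util/main-loop.c:)
>>
>>     /* simplified sketch */
>>     qemu_mutex_unlock_iothread();  /* a waiting vCPU thread can take the lock now */
>>     ret = qemu_poll_ns((GPollFD *)gpollfds->data, gpollfds->len, timeout);
>>     qemu_mutex_lock_iothread();    /* blocks until that thread releases it */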
>>
>> This is something that I can observe:
>>
>> mirror_flush yields the coroutine; the main loop iterates and blocks on
>> the lock because a vCPU thread was already waiting for it. Now the vCPU
>> thread owns the lock and the main thread waits for it to be released.
>> The vCPU thread does cmd_write_dma and releases the lock. Then the main
>> thread takes the lock and continues to run, eventually resuming the
>> yielded coroutine.
>> If the vCPU's requests aren't completed by that moment, we will assert
>> at (X). If the vCPU's requests are completed, we won't even notice that
>> we had some writes while in the drained section.
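>>
>> Schematically, the interleaving is:
>>
>>     main thread                          vCPU thread
>>     -----------                          -----------
>>     bdrv_drained_begin(bs)
>>     mirror_flush() yields
>>     main loop iterates:
>>       unlock global lock   ----------->  lock global lock
>>       poll                               cmd_write_dma: submit write
>>       wait for lock        <-----------  unlock global lock
>>     coroutine resumes
>>     assert at (X) if the write is still in flight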
>>
>> Denis
>>>
>>>>>> On 29.06.2018 15:40, Denis Plotnikov wrote:
>>>>>>> There are cases when a request to a block driver state shouldn't
>>>>>>> have appeared, producing dangerous race conditions.
>>>>>>> This misbehaviour usually happens with storage devices emulated
>>>>>>> without eventfd for guest-to-host notifications, like IDE.
>>>>>>>
>>>>>>> The issue arises when the context is in the "drained" section
>>>>>>> and doesn't expect a request to come, but the request comes from a
>>>>>>> device that doesn't use an iothread and whose context is processed
>>>>>>> by the main loop.
>>>>>>>
>>>>>>> The main loop, unlike the iothread event loop, isn't blocked by the
>>>>>>> "drained" section.
>>>>>>> A request arriving and being processed while in the "drained"
>>>>>>> section can spoil the consistency of the block driver state.
>>>>>>>
>>>>>>> This behavior can be observed in the following KVM-based case:
>>>>>>>
>>>>>>> 1. Set up a VM with an IDE disk.
>>>>>>> 2. Inside the VM, start a disk-writing load for the IDE device,
>>>>>>>      e.g.: dd if=<file> of=<file> bs=X count=Y oflag=direct
>>>>>>> 3. On the host, create a mirroring block job for the IDE device,
>>>>>>>      e.g.: drive_mirror <your_IDE> <your_path>
>>>>>>> 4. On the host, finish the block job,
>>>>>>>      e.g.: block_job_complete <your_IDE>
>>>>>>>     Having done the 4th action, you can get an assert:
>>>>>>> assert(QLIST_EMPTY(&bs->tracked_requests)) from mirror_run.
>>>>>>> On my setup, the assert is reproducible about 1 time in 3.
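>>>>>>>
>>>>>>> For concreteness, a reproduction session could look like this
>>>>>>> (the device and file names below are made up for illustration):
>>>>>>>
>>>>>>>     # inside the guest:
>>>>>>>     dd if=/dev/zero of=/root/load.bin bs=1M count=4096 oflag=direct
>>>>>>>     # on the host, in the HMP monitor:
>>>>>>>     (qemu) drive_mirror ide0-hd0 /var/tmp/mirror.img
>>>>>>>     ... wait for the job to become ready ...
>>>>>>>     (qemu) block_job_complete ide0-hd0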
>>>>>>>
>>>>>>> The patch series introduces a mechanism to postpone such requests
>>>>>>> until the BDS leaves the "drained" section, for devices not using
>>>>>>> iothreads.
>>>>>>> Also, it modifies the asynchronous block backend infrastructure to
>>>>>>> use that mechanism and resolve the assert bug for IDE devices.
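>>>>>>>
>>>>>>> As a rough sketch of the idea (the names below are illustrative,
>>>>>>> not necessarily the ones used in the patches): a request arriving
>>>>>>> while the context is drained is queued instead of executed, and
>>>>>>> the queue is flushed when the drained section ends.
>>>>>>>
>>>>>>>     /* illustrative sketch only */
>>>>>>>     typedef struct PostponedAction {
>>>>>>>         QEMUBHFunc *cb;
>>>>>>>         void *opaque;
>>>>>>>         QSLIST_ENTRY(PostponedAction) next;
>>>>>>>     } PostponedAction;
>>>>>>>
>>>>>>>     void aio_postponed_add(AioContext *ctx, QEMUBHFunc *cb, void *opaque)
>>>>>>>     {
>>>>>>>         if (ctx->drained) {                    /* hypothetical flag */
>>>>>>>             PostponedAction *a = g_new0(PostponedAction, 1);
>>>>>>>             a->cb = cb;
>>>>>>>             a->opaque = opaque;
>>>>>>>             QSLIST_INSERT_HEAD(&ctx->postponed, a, next);
>>>>>>>         } else {
>>>>>>>             aio_bh_schedule_oneshot(ctx, cb, opaque);
>>>>>>>         }
>>>>>>>     }
>>>>>>>
>>>>>>>     /* called when the drained section ends */
>>>>>>>     void aio_postponed_run(AioContext *ctx)
>>>>>>>     {
>>>>>>>         PostponedAction *a;
>>>>>>>         while ((a = QSLIST_FIRST(&ctx->postponed))) {
>>>>>>>             QSLIST_REMOVE_HEAD(&ctx->postponed, next);
>>>>>>>             aio_bh_schedule_oneshot(ctx, a->cb, a->opaque);
>>>>>>>             g_free(a);
>>>>>>>         }
>>>>>>>     }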
>>>>>>>
>>>>>>> Denis Plotnikov (2):
>>>>>>>      async: add infrastructure for postponed actions
>>>>>>>      block: postpone the coroutine executing if the BDS's is drained
>>>>>>>
>>>>>>>     block/block-backend.c | 58 ++++++++++++++++++++++++++++++---------
>>>>>>>     include/block/aio.h   | 63 +++++++++++++++++++++++++++++++++++++++++++
>>>>>>>     util/async.c          | 33 +++++++++++++++++++++++
>>>>>>>     3 files changed, 142 insertions(+), 12 deletions(-)
>>>>>>>
>>>>>>
>>>>
>>>> -- 
>>>> Best,
>>>> Denis
>>
> 

-- 
--js


