qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 09/12] jobs: ensure sleep in job_sleep_ns is fully performed


From: Emanuele Giuseppe Esposito
Subject: Re: [PATCH 09/12] jobs: ensure sleep in job_sleep_ns is fully performed
Date: Thu, 3 Feb 2022 15:21:35 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0


On 26/01/2022 12:21, Stefan Hajnoczi wrote:
> On Tue, Jan 18, 2022 at 11:27:35AM -0500, Emanuele Giuseppe Esposito wrote:
>> If a drain happens while a job is sleeping, the timeout
>> gets cancelled and the job continues once the drain ends.
>> This is especially bad for the sleep performed in commit and stream
>> jobs, since that is dictated by ratelimit to maintain a certain speed.
>>
>> Basically the execution path is the followig:
> 
> s/followig/following/
> 
>> 1. job calls job_sleep_ns, and yield with a timer in @ns ns.
>> 2. meanwhile, a drain is executed, and
>>    child_job_drained_{begin/end} could be executed as ->drained_begin()
>>    and ->drained_end() callbacks.
>>    Therefore child_job_drained_begin() enters the job, that continues
>>    execution in job_sleep_ns() and calls job_pause_point_locked().
>> 3. job_pause_point_locked() detects that we are in the middle of a
>>    drain, and firstly deletes any existing timer and then yields again,
>>    waiting for ->drained_end().
>> 4. Once draining is finished, child_job_drained_end() runs and resumes
>>    the job. At this point, the timer has been lost and we just resume
>>    without checking if enough time has passed.
>>
>> This fix implies that from now onwards, job_sleep_ns will force the job
>> to sleep @ns, even if it is wake up (purposefully or not) in the middle
>> of the sleep. Therefore qemu-iotests test might run a little bit slower,
>> depending on the speed of the job. Setting a job speed to values like "1"
>> is not allowed anymore (unless you want to wait forever).
>>
>> Because of this fix, test_stream_parallel() in tests/qemu-iotests/030
>> takes too long, since speed of stream job is just 1024 and before
>> it was skipping all the wait thanks to the drains. Increase the
>> speed to 256 * 1024. Exactly the same happens for test 151.
>>
>> Instead we need to sleep less in test_cancel_ready() test-blockjob.c,
>> so that the job will be able to exit the sleep and transition to ready
>> before the main loop asserts.
> 
> I remember seeing Hanna and Kevin use carefully rate-limited jobs in
> qemu-iotests. They might have thoughts on whether this patch is
> acceptable or not.
> 

I think the speed was carefully set as "slow enough" just to give time
to the operation to happen while the job was running. Anyways, all tests
I run work as intended, I just increased their speed slightly.
Having speed=1 would make e job really really slow.

Emanuele




reply via email to

[Prev in Thread] Current Thread [Next in Thread]