Re: iotest 129

qemu-block

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: iotest 129

From:	Max Reitz
Subject:	Re: iotest 129
Date:	Wed, 13 Jan 2021 12:05:28 +0100
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.5.0

On 13.01.21 10:53, Vladimir Sementsov-Ogievskiy wrote:

12.01.2021 20:44, Max Reitz wrote:
Hi,
tl;dr: I have some troubles debugging what’s wrong with iotest 129.It wants to check that 'stop' does not drain a block job, but to me itseems like that’s exactly what’s happening with the mirror job.
For quite some time, I’ve had 129 disabled in my test branch becauseit fails all the time for me. Now it came up again in Vladimir’sasync backup series, and so I tried my hands at debugging it once again.
Recap: 129 writes 128M to some image, then runs a block job from there(while the source drive is supposedly throttled), then issues a stopcommand, and checks that the job is not drained. I.e., still running.
(It checks the “running” state via @busy, which is probably wrong; itshould verify that @state == 'running' (which wasn’t available back in2015), but that’s not really why I’m writing this mail.)
Like the last time I tried
(https://lists.nongnu.org/archive/html/qemu-block/2019-06/msg00499.html)I can see that block jobs completely ignore BB throttling: If youapply the attachment show-progress.patch, you’ll probably see someprogress made while the test waits for five seconds. (Here, on tmpfs,mirror and commit get to READY, and backup completes.)
OK, so now that block jobs don’t completely break with filters, youcan instead use a filter node on the source (as I tried in the patchreferenced above). And to fully fix the test, we’d also replace the@busy == True check by @status == 'running'. That’s the attachmentfilter-node-show-progress.patch.
If I apply that, I can see that now there actually is some throttling.After the time.sleep(5), mirror and commit get to exactly 1 MB, andbackup gets to 1.0625 MB. Good.
However, after the stop is issued, backup stays at 1.2 MB (good), butmirror and commit progress obviously without throttling to 30, 40, 50MB, whatever. So it appears to me they are drained by the stop.I.e., precisely what the iotest is trying to prove not to happen.
I don't follow.. Increasing of progress means that jobs are drained?

I don’t know. It does mean that throttling is ignored for a bit,though. I could imagine that’s because something is being drained.

I plan to continue investigating, but I just wanted to send this mailto see whether perhaps someone has an idea on what’s going on.
(My main problem is that bisecting this is hard. AFAIK the throttlingapplied in master:129 has been broken ever since blockdev throttlingwas moved to the BB. Even without the “Deal with filters” series, youcan use at least mirror sync=full from filter nodes, so I tried to usefilter-node-show-progress.patch for just test_drive_mirror(), but thatonly works back until fe4f0614ef9e361d (or rather 0f12264e7a4145 ifyou prefer not to get a SIGABRT). Before that commit, it hangs.)
Max
Hmm, in show-progress.patch you call "stop" the second time.. It's amistake I think..


Ah, oops.  Yes, not sure, how that part got in (some rebasing mistake).

Still, removing those three duplicated lines (stop + query-block-jobs)yields the same result. (I mean, what else is to be expected; BBthrottling does nothing, so even before the first stop, the jobs areREADY/COMPLETED.)

Also, on current master x-bps-total I can find only in iotests.. Whereis it defined o_O? If I change it to be bps-total, it fails.. Strange.


block/throttle-groups.c defines x- as a THROTTLE_OPT_PREFIX... :/

I've run the test with your patch with throttling filter (and a bit morelogging).. Interesting. It looks like throttling just don't worknoramlly after stop.. I see that mirror does one 1M request, and itobviously overflow throttle limit, so during your next 5 seconds it doesnothing (and we see progress of 1M). But immediately after "stop"command all 16 read requests pending for throttling goes, and then a lotmore read requests (hmm, exactly 31) are issued and not throttled at all(but goes through throttle node). And then new 16 requests are issuedand throttled. I've looked at throttle_group_co_io_limits_intercept()and I just don't understand how it works)

Hm. So you’re saying only the current set of requests are drained, butno new ones are generated?

Perhaps 129 was introduced to check that block jobs don’t run tocompletion on 'stop'. The commit before it (e62303a437af72141^) makesblock jobs pause in bdrv_drain_all(), so they don’t generate morerequests. Perhaps we just need to ensure that mirror won’t generatemany concurrent requests.

Indeed. Setting buf_size=65536 leads to offset reaching 64k after thesleep, and then 128k after the stop. That makes sense now.


Now there’s only one problem: That doesn’t work with commit…

Then again, the commit 129 uses is an active commit, i.e. just mirror.It looks like we can translate it into a non-active commit, though thenwe still have no control over the buffer size. But then it onlyprogresses to 2.5 MB, which I guess is fine...

I suppose with your async backup series, we should then limitmax-workers and max-chunk to the minimum for the backup job?

Max

[Prev in Thread]

Current Thread

[Next in Thread]

iotest 129, Max Reitz, 2021/01/12
- Re: iotest 129, Vladimir Sementsov-Ogievskiy, 2021/01/13
  - Re: iotest 129, Max Reitz <=
    - Re: iotest 129, Vladimir Sementsov-Ogievskiy, 2021/01/13
    - Re: iotest 129, Vladimir Sementsov-Ogievskiy, 2021/01/13

Prev by Date: Re: [PATCH v4] block: report errno when flock fcntl fails
Next by Date: Re: [PATCH v6 07/11] iotests: add findtests.py
Previous by thread: Re: iotest 129
Next by thread: Re: iotest 129
Index(es):
- Date
- Thread