qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: iotest 129


From: Vladimir Sementsov-Ogievskiy
Subject: Re: iotest 129
Date: Wed, 13 Jan 2021 12:53:19 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.6.0

12.01.2021 20:44, Max Reitz wrote:
Hi,

tl;dr: I have some troubles debugging what’s wrong with iotest 129.  It wants 
to check that 'stop' does not drain a block job, but to me it seems like that’s 
exactly what’s happening with the mirror job.


For quite some time, I’ve had 129 disabled in my test branch because it fails 
all the time for me.  Now it came up again in Vladimir’s async backup series, 
and so I tried my hands at debugging it once again.

Recap: 129 writes 128M to some image, then runs a block job from there (while 
the source drive is supposedly throttled), then issues a stop command, and 
checks that the job is not drained.  I.e., still running.

(It checks the “running” state via @busy, which is probably wrong; it should 
verify that @state == 'running' (which wasn’t available back in 2015), but 
that’s not really why I’m writing this mail.)

Like the last time I tried
(https://lists.nongnu.org/archive/html/qemu-block/2019-06/msg00499.html) I can 
see that block jobs completely ignore BB throttling: If you apply the 
attachment show-progress.patch, you’ll probably see some progress made while 
the test waits for five seconds.  (Here, on tmpfs, mirror and commit get to 
READY, and backup completes.)

OK, so now that block jobs don’t completely break with filters, you can instead 
use a filter node on the source (as I tried in the patch referenced above).  
And to fully fix the test, we’d also replace the @busy == True check by @status 
== 'running'.  That’s the attachment filter-node-show-progress.patch.

If I apply that, I can see that now there actually is some throttling. After 
the time.sleep(5), mirror and commit get to exactly 1 MB, and backup gets to 
1.0625 MB.  Good.

However, after the stop is issued, backup stays at 1.2 MB (good), but mirror 
and commit progress obviously without throttling to 30, 40, 50 MB, whatever.  
So it appears to me they are drained by the stop.  I.e., precisely what the 
iotest is trying to prove not to happen.

I don't follow.. Increasing of progress means that jobs are drained?



I plan to continue investigating, but I just wanted to send this mail to see 
whether perhaps someone has an idea on what’s going on.

(My main problem is that bisecting this is hard.  AFAIK the throttling applied 
in master:129 has been broken ever since blockdev throttling was moved to the 
BB.  Even without the “Deal with filters” series, you can use at least mirror 
sync=full from filter nodes, so I tried to use filter-node-show-progress.patch 
for just test_drive_mirror(), but that only works back until fe4f0614ef9e361d 
(or rather 0f12264e7a4145 if you prefer not to get a SIGABRT).  Before that 
commit, it hangs.)

Max


Hmm, in show-progress.patch you call "stop" the second time.. It's a mistake I 
think..

Also, on current master x-bps-total I can find only in iotests.. Where is it 
defined o_O? If I change it to be bps-total, it fails.. Strange.

I've run the test with your patch with throttling filter (and a bit more logging).. 
Interesting. It looks like throttling just don't work noramlly after stop.. I see that 
mirror does one 1M request, and it obviously overflow throttle limit, so during your next 
5 seconds it does nothing (and we see progress of 1M). But immediately after 
"stop" command all 16 read requests pending for throttling goes, and then a lot 
more read requests (hmm, exactly 31) are issued and not throttled at all (but goes 
through throttle node). And then new 16 requests are issued and throttled. I've looked at 
throttle_group_co_io_limits_intercept() and I just don't understand how it works)


--
Best regards,
Vladimir



reply via email to

[Prev in Thread] Current Thread [Next in Thread]