Re: [Qemu-devel] Why qemu write/rw speed is so low?


From: Zhi Yong Wu
Subject: Re: [Qemu-devel] Why qemu write/rw speed is so low?
Date: Tue, 13 Sep 2011 10:52:44 +0800

This is the actual log when fio is run with bs=128K and bps=1000000 (block
I/O throttling):

  8,2    0        1     0.000000000 24332  A  WS 79958528 + 256 <- (253,2) 71830016
  8,0    0        2     0.000000912 24332  A  WS 80984576 + 256 <- (8,2) 79958528
  8,2    0        3     0.000001778 24332  Q  WS 80984576 + 256 [qemu-system-x86]
  8,2    0        4     0.000006527 24332  G  WS 80984576 + 256 [qemu-system-x86]
  8,2    0        5     0.000007817 24332  P   N [qemu-system-x86]
  8,2    0        6     0.000011234 24332  I  WS 80984576 + 256 [qemu-system-x86]
CPU0 (8,2):
 Reads Queued:           0,        0KiB  Writes Queued:         558,   25,244KiB
 Read Dispatches:        0,        0KiB  Write Dispatches:      265,   25,440KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:        0,        0KiB  Writes Completed:    1,027,   56,420KiB
 Read Merges:            0,        0KiB  Write Merges:           19,       76KiB
 Read depth:             0               Write depth:             3
 IO unplugs:           217               Timer unplugs:         268
CPU1 (8,2):
 Reads Queued:           0,        0KiB  Writes Queued:         483,   31,176KiB
 Read Dispatches:        0,        0KiB  Write Dispatches:      262,   30,980KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:        0,        0KiB  Writes Completed:        0,        0KiB
 Read Merges:            0,        0KiB  Write Merges:           20,       80KiB
 Read depth:             0               Write depth:             3
 IO unplugs:           265               Timer unplugs:         181

Total (8,2):
 Reads Queued:           0,        0KiB  Writes Queued:       1,041,   56,420KiB
 Read Dispatches:        0,        0KiB  Write Dispatches:      527,   56,420KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:        0,        0KiB  Writes Completed:    1,027,   56,420KiB
 Read Merges:            0,        0KiB  Write Merges:           39,      156KiB
 IO unplugs:           482               Timer unplugs:         449

Throughput (R/W): 0KiB/s / 482KiB/s
Events (8,2): 17,661 entries
Skips: 1,820 forward (3,918,005 -  99.6%)

I found that the I/O pattern the host issues is different each time fio
is run with a different bs value.
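
(For reference, a guest-side fio run of the kind quoted below can be
reproduced with something like the following; this is only a sketch, and
the target file, size, and direct=1 setting are assumptions rather than the
exact job used in this thread:)

  # Sequential-write job, with bs varied per run (512, 8k, 64k, 128k).
  # /mnt/test/fio.dat is a placeholder path inside the guest.
  fio --name=test --rw=write --bs=128k --size=50m \
      --ioengine=libaio --iodepth=1 --direct=1 \
      --filename=/mnt/test/fio.dat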

On Tue, Sep 13, 2011 at 10:38 AM, Zhi Yong Wu <address@hidden> wrote:
> On Fri, Sep 9, 2011 at 6:38 PM, Stefan Hajnoczi
> <address@hidden> wrote:
>> On Fri, Sep 09, 2011 at 05:44:36PM +0800, Zhi Yong Wu wrote:
>>> Today, I did some basic I/O testing and suddenly found that the qemu write
>>> and rw speeds are very low now. My qemu binary is built from commit
>>> 344eecf6995f4a0ad1d887cec922f6806f91a3f8.
>>>
>>> Does qemu have a regression?
>>>
>>> The test data is shown below:
>>>
>>> 1.) write
>>>
>>> test: (g=0): rw=write, bs=512-512/512-512, ioengine=libaio, iodepth=1
>>
>> Please post your QEMU command-line.  If your -drive is using
>> cache=writethrough then small writes are slow because they require the
>> physical disk to write and then synchronize its write cache.  Typically
>> cache=none is a good setting to use for local disks.
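
(For comparison, the cache mode is selected on the -drive option itself; a
minimal sketch with a placeholder image path:)

  -drive file=disk.img,if=virtio,cache=none          # O_DIRECT, bypasses the host page cache
  -drive file=disk.img,if=virtio,cache=writethrough  # each write is flushed through to the disk before completion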
>>
>> The block size of 512 bytes is too small.  Ext4 uses a 4 KB block size,
>> so I think a 512 byte write from the guest could cause a 4 KB
>> read-modify-write operation on the host filesystem.
>>
>> You can check this by running btrace(8) on the host during the
>> benchmark.  The blktrace output and the summary statistics will show
>> what I/O pattern the host is issuing.
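
(A typical way to capture such a trace on the host would be along these
lines; the device name below is a placeholder, not taken from this thread:)

  # Live trace with summary statistics printed on exit (Ctrl-C):
  btrace /dev/sdb
  # Or record first and post-process afterwards:
  blktrace -d /dev/sdb -o trace
  blkparse -i trace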
>  8,2    0        1     0.000000000   337  A  WS 425081504 + 8 <- (253,1) 42611360
>  8,0    0        2     0.000000896   337  A  WS 426107552 + 8 <- (8,2) 425081504
>  8,2    0        3     0.000001772   337  Q  WS 426107552 + 8 [jbd2/dm-1-8]
>  8,2    0        4     0.000006617   337  G  WS 426107552 + 8 [jbd2/dm-1-8]
>  8,2    0        5     0.000007862   337  P   N [jbd2/dm-1-8]
>  8,2    0        6     0.000010481   337  I  WS 426107552 + 8 [jbd2/dm-1-8]
> .....
> CPU0 (8,2):
>  Reads Queued:          11,      416KiB  Writes Queued:          20,       72KiB
>  Read Dispatches:       12,      440KiB  Write Dispatches:        8,       72KiB
>  Reads Requeued:         0               Writes Requeued:         0
>  Reads Completed:       14,      448KiB  Writes Completed:       12,       72KiB
>  Read Merges:            0,        0KiB  Write Merges:           10,       40KiB
>  Read depth:             2               Write depth:             2
>  IO unplugs:            11               Timer unplugs:           0
> CPU1 (8,2):
>  Reads Queued:           8,       32KiB  Writes Queued:           0,        0KiB
>  Read Dispatches:        2,        8KiB  Write Dispatches:        0,        0KiB
>  Reads Requeued:         0               Writes Requeued:         0
>  Reads Completed:        0,        0KiB  Writes Completed:        0,        0KiB
>  Read Merges:            5,       20KiB  Write Merges:            0,        0KiB
>  Read depth:             2               Write depth:             2
>  IO unplugs:             0               Timer unplugs:           0
>
> Total (8,2):
>  Reads Queued:          19,      448KiB  Writes Queued:          20,       72KiB
>  Read Dispatches:       14,      448KiB  Write Dispatches:        8,       72KiB
>  Reads Requeued:         0               Writes Requeued:         0
>  Reads Completed:       14,      448KiB  Writes Completed:       12,       72KiB
>  Read Merges:            5,       20KiB  Write Merges:           10,       40KiB
>  IO unplugs:            11               Timer unplugs:           0
>
> Throughput (R/W): 69KiB/s / 11KiB/s
> Events (8,2): 411 entries
> Skips: 50 forward (5,937 -  93.5%)
>
> From its log, the host writes 8 blocks each time. What is each block's size?
>
>>
>> I suggest changing your fio block size to 8 KB if you want to try a
>> small block size.  If you want a large block size, try 64 KB or 128 KB.
> When -drive if=virtio,cache=none,file=xxx,bps=1000000 is set
> (note that bps is in bytes per second).
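
(A full command line was not posted in this thread, so the following is only
a sketch: everything except the -drive option is a placeholder, the image
path stands in for the "xxx" above, and bps= is the per-drive throttle from
the block I/O throttling patches being tested, in bytes per second:)

  qemu-system-x86_64 -m 1024 -smp 2 \
      -drive if=virtio,cache=none,file=guest.img,bps=1000000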
>
> test: (g=0): rw=write, bs=512-512/512-512, ioengine=libaio, iodepth=1
> Starting 1 process
> ^Cbs: 1 (f=1): [W] [1.9% done] [0K/61K /s] [0/120 iops] [eta 01h:00m:57s]
> fio: terminating on signal 2
>
> test: (groupid=0, jobs=1): err= 0: pid=27515
>  write: io=3,960KB, bw=56,775B/s, iops=110, runt= 71422msec
>    slat (usec): min=19, max=31,032, avg=65.03, stdev=844.57
>    clat (msec): min=1, max=353, avg= 8.93, stdev=11.91
>     lat (msec): min=1, max=353, avg= 8.99, stdev=12.00
>    bw (KB/s) : min=    2, max=   60, per=102.06%, avg=56.14, stdev=10.89
>  cpu          : usr=0.04%, sys=0.61%, ctx=7936, majf=0, minf=26
>  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>     issued r/w: total=0/7920, short=0/0
>
>     lat (msec): 2=0.03%, 4=0.09%, 10=98.48%, 20=0.54%, 50=0.67%
>     lat (msec): 100=0.03%, 250=0.05%, 500=0.11%
>
> Run status group 0 (all jobs):
>  WRITE: io=3,960KB, aggrb=55KB/s, minb=56KB/s, maxb=56KB/s,
> mint=71422msec, maxt=71422msec
>
> Disk stats (read/write):
>  dm-0: ios=6/8007, merge=0/0, ticks=179/78114, in_queue=78272,
> util=99.58%, aggrios=4/7975, aggrmerge=0/44, aggrticks=179/75153,
> aggrin_queue=75304, aggrutil=99.53%
>    vda: ios=4/7975, merge=0/44, ticks=179/75153, in_queue=75304, util=99.53%
> test: (g=0): rw=write, bs=8K-8K/8K-8K, ioengine=libaio, iodepth=1
> Starting 1 process
> Jobs: 1 (f=1): [W] [100.0% done] [0K/752K /s] [0/91 iops] [eta 00m:00s]
> test: (groupid=0, jobs=1): err= 0: pid=27526
>  write: io=51,200KB, bw=668KB/s, iops=83, runt= 76622msec
>    slat (usec): min=20, max=570K, avg=386.16, stdev=11400.53
>    clat (msec): min=1, max=1,699, avg=11.57, stdev=29.85
>     lat (msec): min=1, max=1,699, avg=11.96, stdev=33.24
>    bw (KB/s) : min=   20, max=  968, per=104.93%, avg=700.95, stdev=245.18
>  cpu          : usr=0.08%, sys=0.41%, ctx=6418, majf=0, minf=25
>  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>     issued r/w: total=0/6400, short=0/0
>
>     lat (msec): 2=0.05%, 4=0.09%, 10=94.08%, 20=0.78%, 50=3.39%
>     lat (msec): 100=0.33%, 250=1.12%, 500=0.11%, 750=0.03%, 2000=0.02%
>
> Run status group 0 (all jobs):
>  WRITE: io=51,200KB, aggrb=668KB/s, minb=684KB/s, maxb=684KB/s,
> mint=76622msec, maxt=76622msec
>
> Disk stats (read/write):
>  dm-0: ios=913/6731, merge=0/0, ticks=161809/696060, in_queue=858086,
> util=100.00%, aggrios=1070/6679, aggrmerge=316/410,
> aggrticks=163975/1587245, aggrin_queue=1751267, aggrutil=100.00%
>    vda: ios=1070/6679, merge=316/410, ticks=163975/1587245,
> in_queue=1751267, util=100.00%
> test: (g=0): rw=write, bs=64K-64K/64K-64K, ioengine=libaio, iodepth=1
> Starting 1 process
> Jobs: 1 (f=1): [W] [100.0% done] [0K/458K /s] [0/6 iops] [eta 00m:00s]
> test: (groupid=0, jobs=1): err= 0: pid=27614
>  write: io=51,200KB, bw=428KB/s, iops=6, runt=119618msec
>    slat (usec): min=28, max=5,507K, avg=7618.90, stdev=194782.69
>    clat (msec): min=14, max=9,418, avg=140.49, stdev=328.96
>     lat (msec): min=14, max=9,418, avg=148.11, stdev=382.21
>    bw (KB/s) : min=   11, max=  664, per=114.06%, avg=488.19, stdev=53.97
>  cpu          : usr=0.03%, sys=0.04%, ctx=825, majf=0, minf=27
>  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>     issued r/w: total=0/800, short=0/0
>
>     lat (msec): 20=0.38%, 50=2.50%, 250=97.00%, >=2000=0.12%
>
> Run status group 0 (all jobs):
>  WRITE: io=51,200KB, aggrb=428KB/s, minb=438KB/s, maxb=438KB/s,
> mint=119618msec, maxt=119618msec
>
> Disk stats (read/write):
>  dm-0: ios=0/938, merge=0/0, ticks=0/517622, in_queue=517712,
> util=100.00%, aggrios=0/883, aggrmerge=0/55, aggrticks=0/351498,
> aggrin_queue=351497, aggrutil=100.00%
>    vda: ios=0/883, merge=0/55, ticks=0/351498, in_queue=351497, util=100.00%
> test: (g=0): rw=write, bs=128K-128K/128K-128K, ioengine=libaio, iodepth=1
> Starting 1 process
> Jobs: 1 (f=1): [W] [100.0% done] [0K/374K /s] [0/2 iops] [eta 00m:00s]
> test: (groupid=0, jobs=1): err= 0: pid=27623
>  write: io=51,200KB, bw=422KB/s, iops=3, runt=121420msec
>    slat (usec): min=29, max=5,484K, avg=17456.88, stdev=274747.99
>    clat (msec): min=174, max=9,559, avg=283.02, stdev=465.37
>     lat (msec): min=176, max=9,559, avg=300.47, stdev=538.58
>    bw (KB/s) : min=   22, max=  552, per=114.21%, avg=480.81, stdev=49.10
>  cpu          : usr=0.00%, sys=0.03%, ctx=425, majf=0, minf=27
>  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>     issued r/w: total=0/400, short=0/0
>
>     lat (msec): 250=9.50%, 500=90.25%, >=2000=0.25%
>
> Run status group 0 (all jobs):
>  WRITE: io=51,200KB, aggrb=421KB/s, minb=431KB/s, maxb=431KB/s,
> mint=121420msec, maxt=121420msec
>
> Disk stats (read/write):
>  dm-0: ios=0/541, merge=0/0, ticks=0/573761, in_queue=574004,
> util=100.00%, aggrios=0/484, aggrmerge=0/57, aggrticks=0/396662,
> aggrin_queue=396662, aggrutil=100.00%
>    vda: ios=0/484, merge=0/57, ticks=0/396662, in_queue=396662, util=100.00%
>
>
>>
>> Stefan
>>
>>
>
>
>
> --
> Regards,
>
> Zhi Yong Wu
>



-- 
Regards,

Zhi Yong Wu


