qemu-devel

Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty


From: Chunguang Li
Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
Date: Sat, 8 Oct 2016 15:55:58 +0800 (GMT+08:00)



> -----Original Message-----
> From: "Amit Shah" <address@hidden>
> Sent: Friday, September 30, 2016
> To: "Chunguang Li" <address@hidden>
> Cc: "Dr. David Alan Gilbert" <address@hidden>, address@hidden, 
> address@hidden, address@hidden, address@hidden
> Subject: Re: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as 
> dirty after they have been sent
> 
> On (Mon) 26 Sep 2016 [22:55:01], Chunguang Li wrote:
> > 
> > 
> > 
> > > -----Original Message-----
> > > From: "Dr. David Alan Gilbert" <address@hidden>
> > > Sent: Monday, September 26, 2016
> > > To: "Chunguang Li" <address@hidden>
> > > Cc: address@hidden, address@hidden, address@hidden, address@hidden, 
> > > address@hidden
> > > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as 
> > > dirty after they have been sent
> > > 
> > > * Chunguang Li (address@hidden) wrote:
> > > > Hi all!
> > > > I have some confusion about the dirty bitmap during migration. I have 
> > > > dug into the code and found that, every now and then during 
> > > > migration, the dirty bitmap is fetched from kernel space 
> > > > through ioctl(KVM_GET_DIRTY_LOG) and then used to update qemu's 
> > > > dirty bitmap. However, I think this mechanism leads to resending 
> > > > some NON-dirty pages.
> > > > 
> > > > Take the first iteration of precopy for instance, during which all the 
> > > > pages will be sent. Before that during the migration setup, the 
> > > > ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel begins to 
> > > > produce the dirty bitmap from this moment. When the pages "that haven't 
> > > > been sent" are written, the kernel space marks them as dirty. However I 
> > > > don't think this is correct, because these pages will be sent during 
> > > > this and the next iterations with the same content (if they are not 
> > > > written again after they are sent). It only makes sense to mark the 
> > > > pages which have already been sent during one iteration as dirty when 
> > > > they are written.
> > > > 
> > > > 
> > > > Am I right about this consideration? If I am right, is there some 
> > > > advice to improve this?
> > > 
> > > I think you're right that this can happen; to clarify I think the
> > > case you're talking about is:
> > > 
> > >   Iteration 1
> > >     sync bitmap
> > >     start sending pages
> > >     page 'n' is modified - but hasn't been sent yet
> > >     page 'n' gets sent
> > >   Iteration 2
> > >     sync bitmap
> > >        'page n is shown as modified'
> > >     send page 'n' again
> > >
> > 
> > Yes, this is exactly the case I am talking about.
> >  
> > > So you're right that is wasteful; I guess it's more wasteful
> > > on big VMs with slow networks where the length of each iteration
> > > is large.
> > 
> > I think this is "very" wasteful. Assume the workload dirties pages 
> > randomly within the guest address space, and the transfer speed is 
> > constant. Intuitively, I think nearly half of the dirty pages produced in 
> > Iteration 1 are not really dirty. This means Iteration 2 takes twice as 
> > long as it would if only the really dirty pages were sent.
> 
> It makes sense, can you get some perf numbers to show what kinds of
> workloads get impacted the most?  That would also help us to figure
> out what kinds of speed improvements we can expect.
> 
> 
>               Amit

I picked 6 workloads and collected the following statistics for 
every iteration (except the final stop-copy one) during precopy.
The numbers were obtained with basic precopy migration, without 
capabilities such as xbzrle or compression. The migration uses a 
dedicated network, with a separate network for the workloads. 
Both are gigabit ethernet. I use qemu-2.5.1.

Three of them (booting, idle, web server) converged to the stop-copy phase 
with the given bandwidth and the default downtime (300ms), while the other
three (kernel compilation, zeusmp, memcached) did not.

A page is "not-really-dirty" if it is written first and sent later
(and not written again after that) during one iteration. I guess this 
happens less often during the later iterations than during the 1st 
iteration, because all the pages of the VM are sent to the dest node during 
the 1st iteration, while during the others only part of the pages are sent. 
So I think the "not-really-dirty" pages are produced mainly during 
the 1st iteration, and maybe very few during the other iterations.
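
To make the "nearly half" intuition concrete, here is a minimal simulation
sketch of the 1st iteration. It is a toy model, not QEMU code: it assumes
pages are sent in address order at a constant rate, and that writes land on
uniformly random pages at uniformly random times. The function name and
parameters are made up for illustration.

```python
import random


def simulate_first_iteration(num_pages, num_writes, seed=0):
    """Toy model of iteration 1 (assumptions, not QEMU behaviour):
    page p is sent at time p / num_pages, and each write hits a random
    page at a random time in [0, 1).

    Returns (dirty, not_really_dirty): how many pages the dirty log would
    report, and how many of those were last written before being sent."""
    rng = random.Random(seed)
    last_write = {}  # page index -> time of its latest write
    for _ in range(num_writes):
        page = rng.randrange(num_pages)
        t = rng.random()
        if t > last_write.get(page, -1.0):
            last_write[page] = t

    dirty = len(last_write)
    # "Not really dirty": the latest write happened before the page was
    # sent, so the destination already holds the current content.
    not_really_dirty = sum(
        1 for page, t in last_write.items() if t < page / num_pages
    )
    return dirty, not_really_dirty


if __name__ == "__main__":
    dirty, wasted = simulate_first_iteration(266450, 20000)
    print(f"dirty: {dirty}, not really dirty: {wasted} "
          f"({wasted / dirty:.1%})")
```

When each dirty page is written only about once, the ratio in this model
comes out close to one half. When pages are written many times per
iteration, it drops, because a later write is more likely to land after the
page has been sent.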

If we could avoid resending the "not-really-dirty" pages, intuitively, I
think the time spent on Iteration 2 would be halved. This is a chain reaction:
the dirty pages produced during Iteration 2 would be halved, which means
the time spent on Iteration 3 would be halved, then Iteration 4, 5...
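
The chain reaction can be sketched with a very simple model. It assumes a
constant dirty rate and a constant send rate (both in pages per second), so
the pages dirtied during an iteration are proportional to its duration. Real
workloads do not behave this cleanly, and all names and numbers below are
hypothetical.

```python
def iteration_durations(pending_pages, dirty_rate, send_rate, iterations):
    """Durations (seconds) of successive iterations in a toy model where
    each iteration sends every pending page and, meanwhile, the guest
    dirties dirty_rate pages per second (assumed constant)."""
    durations = []
    pending = pending_pages
    for _ in range(iterations):
        duration = pending / send_rate
        durations.append(duration)
        # Pages dirtied while this iteration ran become the next workload.
        pending = dirty_rate * duration
    return durations


if __name__ == "__main__":
    # Hypothetical numbers: 8192 pages/s is 32 MiB/s with 4 KiB pages.
    all_dirty = iteration_durations(56000, 4000, 8192, 4)
    really_dirty = iteration_durations(28000, 4000, 8192, 4)
    for k, (a, b) in enumerate(zip(all_dirty, really_dirty), start=2):
        print(f"Iteration {k}: {a:.2f} s vs {b:.2f} s")
```

In this model, halving the pages that Iteration 2 has to send halves every
later iteration as well, which is the effect described above.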

So I think "booting" and "kernel compilation" should benefit a lot from this
improvement. "Kernel compilation" would benefit because some 
iterations take around 600ms, and if they are halved to 300ms, then the 
precopy may have the chance to step into the stop-and-copy phase.

On the other hand, "idle" and "web server" would not benefit much, because
most of the time is spent on the 1st iteration and little on the others.

As for "zeusmp" and "memcached", although the time spent on the iterations 
other than the 1st one may be halved, they still could not converge to stop 
and copy with the 300ms downtime.
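
As a rough check of which workloads can reach stop and copy, the sketch below
estimates how long the remaining dirty pages would take to send. It assumes
4 KiB pages and reads "32MB/s" as 32 MiB/s. As far as I understand, QEMU
bases the real decision on a measured bandwidth rather than the configured
limit, so this is only an approximation, and the function names are made up
for illustration.

```python
PAGE_SIZE = 4096            # assumption: 4 KiB guest pages
DEFAULT_BANDWIDTH = 32 << 20  # assumption: "32MB/s" means 32 MiB/s


def estimated_downtime_ms(remaining_pages, bandwidth=DEFAULT_BANDWIDTH):
    """Time in ms to send the remaining dirty pages at the given
    bandwidth (bytes per second)."""
    return remaining_pages * PAGE_SIZE / bandwidth * 1000.0


def can_enter_stop_copy(remaining_pages, max_downtime_ms=300.0,
                        bandwidth=DEFAULT_BANDWIDTH):
    """True if the remaining pages fit within the allowed downtime."""
    return estimated_downtime_ms(remaining_pages, bandwidth) < max_downtime_ms


if __name__ == "__main__":
    # Remaining dirty pages taken from the statistics in this mail.
    samples = {
        "booting (after iteration 4)": 2430,
        "idle (after iteration 3)": 1849,
        "web server (after iteration 5)": 2167,
        "kernel compilation (lowest seen)": 7200,
    }
    for name, pages in samples.items():
        print(f"{name}: {estimated_downtime_ms(pages):.0f} ms, "
              f"converges: {can_enter_stop_copy(pages)}")
```

Under these assumptions the threshold is about 2457 pages. The three
converged workloads end below it (booting needs about 297 ms), while the
lowest value seen for kernel compilation would need about 879 ms.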

--------------------1 vcpu, 1 GB ram, default bandwidth (32MB/s):------------------

1. booting : begin to migrate when the VM is booting

Iteration   1, duration:   6997 ms , transferred pages:   266450 (n:    57269, d:   209181 ) , new dirty pages:    56414 , remaining dirty pages:    56414
Iteration   2, duration:   6497 ms , transferred pages:    54008 (n:    52701, d:     1307 ) , new dirty pages:    48053 , remaining dirty pages:    50459
Iteration   3, duration:   5800 ms , transferred pages:    48232 (n:    47444, d:      788 ) , new dirty pages:     9129 , remaining dirty pages:    11356
Iteration   4, duration:   1100 ms , transferred pages:     9091 (n:     8998, d:       93 ) , new dirty pages:      165 , remaining dirty pages:     2430
Iteration   5, duration:      1 ms , transferred pages:        0 (n:        0, d:        0 ) , new dirty pages:        0 , remaining dirty pages:     2430
(note: When the workload does converge, the output of the last iteration is 
"fake". It just indicates that the precopy now steps into the stop-copy phase.
"n" means "normal pages" and "d" means "duplicate (zero) pages".)

2. idle

Iteration   1, duration:  14496 ms , transferred pages:   266450 (n:   118980, d:   147470 ) , new dirty pages:    17398 , remaining dirty pages:    17398
Iteration   2, duration:   1896 ms , transferred pages:    14953 (n:    14854, d:       99 ) , new dirty pages:     1849 , remaining dirty pages:     4294
Iteration   3, duration:    300 ms , transferred pages:     2454 (n:     2454, d:        0 ) , new dirty pages:        9 , remaining dirty pages:     1849
Iteration   4, duration:      1 ms , transferred pages:        0 (n:        0, d:        0 ) , new dirty pages:        0 , remaining dirty pages:     1849

3. kernel compilation (cannot converge)

Iteration   1, duration:  20700 ms , transferred pages:   266450 (n:   169778, d:    96672 ) , new dirty pages:    40067 , remaining dirty pages:    40067
Iteration   2, duration:   4696 ms , transferred pages:    38401 (n:    37787, d:      614 ) , new dirty pages:     8852 , remaining dirty pages:    10518
Iteration   3, duration:   1000 ms , transferred pages:     8642 (n:     8180, d:      462 ) , new dirty pages:     6331 , remaining dirty pages:     8207
Iteration   4, duration:    700 ms , transferred pages:     6110 (n:     5726, d:      384 ) , new dirty pages:     5242 , remaining dirty pages:     7339
Iteration   5, duration:    600 ms , transferred pages:     5007 (n:     4908, d:       99 ) , new dirty pages:     4868 , remaining dirty pages:     7200
Iteration   6, duration:    600 ms , transferred pages:     5226 (n:     4908, d:      318 ) , new dirty pages:     6142 , remaining dirty pages:     8116
Iteration   7, duration:    700 ms , transferred pages:     5985 (n:     5726, d:      259 ) , new dirty pages:     5902 , remaining dirty pages:     8033
Iteration   8, duration:    701 ms , transferred pages:     5893 (n:     5726, d:      167 ) , new dirty pages:     7502 , remaining dirty pages:     9642
Iteration   9, duration:    900 ms , transferred pages:     7623 (n:     7362, d:      261 ) , new dirty pages:     6408 , remaining dirty pages:     8427
Iteration  10, duration:    700 ms , transferred pages:     6008 (n:     5726, d:      282 ) , new dirty pages:     8312 , remaining dirty pages:    10731
Iteration  11, duration:   1000 ms , transferred pages:     8353 (n:     8180, d:      173 ) , new dirty pages:     6874 , remaining dirty pages:     9252
Iteration  12, duration:    899 ms , transferred pages:     7477 (n:     7362, d:      115 ) , new dirty pages:     5573 , remaining dirty pages:     7348
Iteration  13, duration:    601 ms , transferred pages:     5099 (n:     4908, d:      191 ) , new dirty pages:     7671 , remaining dirty pages:     9920
Iteration  14, duration:    900 ms , transferred pages:     7586 (n:     7362, d:      224 ) , new dirty pages:     7359 , remaining dirty pages:     9693
Iteration  15, duration:    900 ms , transferred pages:     7682 (n:     7362, d:      320 ) , new dirty pages:     7371 , remaining dirty pages:     9382

4. cpu2006.zeusmp (cannot converge)

Iteration   1, duration:  21603 ms , transferred pages:   266450 (n:   176660, d:    89790 ) , new dirty pages:   145625 , remaining dirty pages:   145625
Iteration   2, duration:   8696 ms , transferred pages:   144389 (n:    70862, d:    73527 ) , new dirty pages:   125124 , remaining dirty pages:   126360
Iteration   3, duration:   6301 ms , transferred pages:   124057 (n:    51379, d:    72678 ) , new dirty pages:   122528 , remaining dirty pages:   124831
Iteration   4, duration:   6400 ms , transferred pages:   124330 (n:    52196, d:    72134 ) , new dirty pages:   124267 , remaining dirty pages:   124768
Iteration   5, duration:   6703 ms , transferred pages:   124034 (n:    54656, d:    69378 ) , new dirty pages:   124151 , remaining dirty pages:   124885
Iteration   6, duration:   6703 ms , transferred pages:   124357 (n:    54658, d:    69699 ) , new dirty pages:   124106 , remaining dirty pages:   124634
Iteration   7, duration:   6602 ms , transferred pages:   124568 (n:    53838, d:    70730 ) , new dirty pages:   133828 , remaining dirty pages:   133894
Iteration   8, duration:   7600 ms , transferred pages:   133030 (n:    62021, d:    71009 ) , new dirty pages:   126612 , remaining dirty pages:   127476
Iteration   9, duration:   7299 ms , transferred pages:   126511 (n:    59569, d:    66942 ) , new dirty pages:   122727 , remaining dirty pages:   123692
Iteration  10, duration:   6609 ms , transferred pages:   123692 (n:    54539, d:    69153 ) , new dirty pages:   122727 , remaining dirty pages:   122727
Iteration  11, duration:   6995 ms , transferred pages:   120347 (n:    56423, d:    63924 ) , new dirty pages:   121430 , remaining dirty pages:   123810
Iteration  12, duration:   6703 ms , transferred pages:   123040 (n:    54657, d:    68383 ) , new dirty pages:   122043 , remaining dirty pages:   122813
Iteration  13, duration:   7006 ms , transferred pages:   122353 (n:    57121, d:    65232 ) , new dirty pages:   133869 , remaining dirty pages:   134329
Iteration  14, duration:   8209 ms , transferred pages:   132325 (n:    66932, d:    65393 ) , new dirty pages:   126914 , remaining dirty pages:   128918
Iteration  15, duration:   7802 ms , transferred pages:   126931 (n:    63671, d:    63260 ) , new dirty pages:   122351 , remaining dirty pages:   124338

5. web server : An apache web server. The client is configured with 50 concurrent connections.

Iteration   1, duration:  30697 ms , transferred pages:   266450 (n:   251215, d:    15235 ) , new dirty pages:    30628 , remaining dirty pages:    30628
Iteration   2, duration:   3496 ms , transferred pages:    28859 (n:    28513, d:      346 ) , new dirty pages:     5805 , remaining dirty pages:     7574
Iteration   3, duration:    701 ms , transferred pages:     5746 (n:     5726, d:       20 ) , new dirty pages:     3433 , remaining dirty pages:     5261
Iteration   4, duration:    400 ms , transferred pages:     3281 (n:     3272, d:        9 ) , new dirty pages:     1539 , remaining dirty pages:     3519
Iteration   5, duration:    199 ms , transferred pages:     1653 (n:     1636, d:       17 ) , new dirty pages:      301 , remaining dirty pages:     2167
Iteration   6, duration:      1 ms , transferred pages:        0 (n:        0, d:        0 ) , new dirty pages:        0 , remaining dirty pages:     2167

--------------------6 vcpu, 6 GB ram, max bandwidth (941.08 mbps):------------------

6. memcached : 4 GB cache, memaslap: all write, concurrency = 5 (cannot converge)

Iteration   1, duration:  42486 ms , transferred pages:  1568087 (n:  1216079, d:   352008 ) , new dirty pages:   571940 , remaining dirty pages:   581023
Iteration   2, duration:  19774 ms , transferred pages:   571700 (n:   567416, d:     4284 ) , new dirty pages:   331690 , remaining dirty pages:   341013
Iteration   3, duration:  11589 ms , transferred pages:   332187 (n:   332095, d:       92 ) , new dirty pages:   222725 , remaining dirty pages:   231551
Iteration   4, duration:   7790 ms , transferred pages:   223571 (n:   223499, d:       72 ) , new dirty pages:   157658 , remaining dirty pages:   165638
Iteration   5, duration:   5518 ms , transferred pages:   158056 (n:   157998, d:       58 ) , new dirty pages:   128130 , remaining dirty pages:   135712
Iteration   6, duration:   4442 ms , transferred pages:   127764 (n:   127701, d:       63 ) , new dirty pages:   104839 , remaining dirty pages:   112787
Iteration   7, duration:   3649 ms , transferred pages:   104581 (n:   104523, d:       58 ) , new dirty pages:   100736 , remaining dirty pages:   108942
Iteration   8, duration:   3532 ms , transferred pages:   101379 (n:   101315, d:       64 ) , new dirty pages:    87869 , remaining dirty pages:    95432
Iteration   9, duration:   3030 ms , transferred pages:    86841 (n:    86786, d:       55 ) , new dirty pages:    77505 , remaining dirty pages:    86096
Iteration  10, duration:   2709 ms , transferred pages:    77875 (n:    77814, d:       61 ) , new dirty pages:    77197 , remaining dirty pages:    85418
Iteration  11, duration:   2696 ms , transferred pages:    77107 (n:    77044, d:       63 ) , new dirty pages:    65010 , remaining dirty pages:    73321
Iteration  12, duration:   2308 ms , transferred pages:    66540 (n:    66484, d:       56 ) , new dirty pages:    64388 , remaining dirty pages:    71169
Iteration  13, duration:   2198 ms , transferred pages:    62953 (n:    62897, d:       56 ) , new dirty pages:    62773 , remaining dirty pages:    70989
Iteration  14, duration:   2214 ms , transferred pages:    63466 (n:    63411, d:       55 ) , new dirty pages:    67538 , remaining dirty pages:    75061
Iteration  15, duration:   2329 ms , transferred pages:    66924 (n:    66875, d:       49 ) , new dirty pages:    63580 , remaining dirty pages:    71717
Iteration  16, duration:   2252 ms , transferred pages:    64554 (n:    64539, d:       15 ) , new dirty pages:    63094 , remaining dirty pages:    70257
Iteration  17, duration:   2188 ms , transferred pages:    62697 (n:    62641, d:       56 ) , new dirty pages:    63016 , remaining dirty pages:    70576
Iteration  18, duration:   2171 ms , transferred pages:    62377 (n:    62322, d:       55 ) , new dirty pages:    56764 , remaining dirty pages:    64963
Iteration  19, duration:   2003 ms , transferred pages:    57382 (n:    57324, d:       58 ) , new dirty pages:    65307 , remaining dirty pages:    72888
Iteration  20, duration:   2240 ms , transferred pages:    64426 (n:    64364, d:       62 ) , new dirty pages:    61585 , remaining dirty pages:    70047
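
The per-iteration lines above follow a fixed layout, so they can be checked
mechanically. The following is a small parsing sketch, not part of QEMU; the
regular expression and the field names are my own. It also checks two
relations that appear to hold in the sample: transferred = n + d, and
remaining = previous remaining - transferred + new dirty.

```python
import re
from typing import NamedTuple


class Iteration(NamedTuple):
    index: int
    duration_ms: int
    transferred: int
    normal: int           # "n" in the log
    duplicate: int        # "d" in the log (zero pages)
    new_dirty: int
    remaining_dirty: int


# \s+ also matches line breaks, so lines wrapped by a mail client still parse.
LINE_RE = re.compile(
    r"Iteration\s+(\d+),\s+duration:\s+(\d+)\s+ms\s+,"
    r"\s+transferred\s+pages:\s+(\d+)\s+\(n:\s+(\d+),\s+d:\s+(\d+)\s+\)\s+,"
    r"\s+new\s+dirty\s+pages:\s+(\d+)\s+,"
    r"\s+remaining\s+dirty\s+pages:\s+(\d+)"
)


def parse_iterations(text):
    """Return one Iteration per matching log line found in text."""
    return [Iteration(*map(int, m.groups())) for m in LINE_RE.finditer(text)]


# First two "booting" lines from this mail, wrapped as a mail client might.
SAMPLE = """
Iteration   1, duration:   6997 ms , transferred pages:   266450 (n:    57269,
d:   209181 ) , new dirty pages:    56414 , remaining dirty pages:    56414
Iteration   2, duration:   6497 ms , transferred pages:    54008 (n:    52701,
d:     1307 ) , new dirty pages:    48053 , remaining dirty pages:    50459
"""

if __name__ == "__main__":
    rows = parse_iterations(SAMPLE)
    for prev, cur in zip(rows, rows[1:]):
        print(cur.index,
              cur.normal + cur.duplicate == cur.transferred,
              cur.remaining_dirty
              == prev.remaining_dirty - cur.transferred + cur.new_dirty)
```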


--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China





