From: Zhanghaoyu (A)
Subject: [Qemu-devel] vm performance degradation after kvm live migration or save-restore with EPT enabled
Date: Thu, 11 Jul 2013 09:36:47 +0000

Hi all,

I encountered a problem similar to the ones reported in the threads below, while 
performing live-migration and save-restore tests on a KVM platform (qemu: 1.4.0, 
host: SUSE 11 SP2, guest: SUSE 11 SP2) with a telecommunication software suite 
running in the guest:
https://lists.gnu.org/archive/html/qemu-devel/2013-05/msg00098.html
http://comments.gmane.org/gmane.comp.emulators.kvm.devel/102506
http://thread.gmane.org/gmane.comp.emulators.kvm.devel/100592
https://bugzilla.kernel.org/show_bug.cgi?id=58771

After a live migration or a "virsh restore [savefile]", one process's CPU 
utilization went up by about 30%, which degraded the throughput of that process.
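
For reference, a symbol-level report like the ones below typically comes from a 
legacy oprofile session along these lines (I am reconstructing the invocation; 
the binary path is illustrative, not the real deployment path):

    # profile system-wide via the timer interrupt, without kernel symbols
    opcontrol --no-vmlinux
    opcontrol --start
    # ... let cscf.scu handle traffic for a while ...
    opcontrol --dump
    # per-symbol breakdown for the process's binary (path illustrative)
    opreport --symbols /opt/cscf/bin/cscf.scu
    opcontrol --shutdown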
The oprofile report on this process in the guest:
pre live migration:
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples  %        app name                 symbol name
248      12.3016  no-vmlinux               (no symbols)
78        3.8690  libc.so.6                memset
68        3.3730  libc.so.6                memcpy
30        1.4881  cscf.scu                 SipMmBufMemAlloc
29        1.4385  libpthread.so.0          pthread_mutex_lock
26        1.2897  cscf.scu                 SipApiGetNextIe
25        1.2401  cscf.scu                 DBFI_DATA_Search
20        0.9921  libpthread.so.0          __pthread_mutex_unlock_usercnt
16        0.7937  cscf.scu                 DLM_FreeSlice
16        0.7937  cscf.scu                 receivemessage
15        0.7440  cscf.scu                 SipSmCopyString
14        0.6944  cscf.scu                 DLM_AllocSlice

post live migration:
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples  %        app name                 symbol name
1586     42.2370  libc.so.6                memcpy
271       7.2170  no-vmlinux               (no symbols)
83        2.2104  libc.so.6                memset
41        1.0919  libpthread.so.0          __pthread_mutex_unlock_usercnt
35        0.9321  cscf.scu                 SipMmBufMemAlloc
29        0.7723  cscf.scu                 DLM_AllocSlice
28        0.7457  libpthread.so.0          pthread_mutex_lock
23        0.6125  cscf.scu                 SipApiGetNextIe
17        0.4527  cscf.scu                 SipSmCopyString
16        0.4261  cscf.scu                 receivemessage
15        0.3995  cscf.scu                 SipcMsgStatHandle
14        0.3728  cscf.scu                 Urilex
12        0.3196  cscf.scu                 DBFI_DATA_Search
12        0.3196  cscf.scu                 SipDsmGetHdrBitValInner
12        0.3196  cscf.scu                 SipSmGetDataFromRefString

So memcpy consumes many more CPU cycles after live migration. Once I restarted 
the process, the problem disappeared. Save-restore shows a similar problem.
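
The host-side counter stats below were taken per vCPU thread over 60 seconds; 
the invocation was roughly the following (the thread id is the one from the 
pre-migration run, and the exact event list is my reconstruction from the output):

    # count cache/TLB events on the vCPU thread for 60 seconds
    perf stat -t 21082 \
        -e page-faults,minor-faults,major-faults,cs,migrations \
        -e alignment-faults,emulation-faults \
        -e L1-dcache-loads,L1-dcache-load-misses \
        -e L1-dcache-stores,L1-dcache-store-misses \
        -e L1-dcache-prefetches,L1-dcache-prefetch-misses \
        -e L1-icache-loads,L1-icache-load-misses \
        -e LLC-loads,LLC-load-misses,LLC-stores,LLC-store-misses \
        -e LLC-prefetches,LLC-prefetch-misses \
        -e dTLB-loads,dTLB-load-misses,dTLB-stores,dTLB-store-misses \
        -e dTLB-prefetches,dTLB-prefetch-misses \
        sleep 60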

perf stat output for the vCPU thread on the host:
pre live migration:
Performance counter stats for thread id '21082':

                 0 page-faults
                 0 minor-faults
                 0 major-faults
             31616 cs
               506 migrations
                 0 alignment-faults
                 0 emulation-faults
        5075957539 L1-dcache-loads                                              [21.32%]
         324685106 L1-dcache-load-misses     #    6.40% of all L1-dcache hits   [21.85%]
        3681777120 L1-dcache-stores                                             [21.65%]
          65251823 L1-dcache-store-misses    #    1.77%                         [22.78%]
                 0 L1-dcache-prefetches                                         [22.84%]
                 0 L1-dcache-prefetch-misses                                    [22.32%]
        9321652613 L1-icache-loads                                              [22.60%]
        1353418869 L1-icache-load-misses     #   14.52% of all L1-icache hits   [21.92%]
         169126969 LLC-loads                                                    [21.87%]
          12583605 LLC-load-misses           #    7.44% of all LL-cache hits    [ 5.84%]
         132853447 LLC-stores                                                   [ 6.61%]
          10601171 LLC-store-misses          #    7.9%                          [ 5.01%]
          25309497 LLC-prefetches            #   30%                            [ 4.96%]
           7723198 LLC-prefetch-misses                                          [ 6.04%]
        4954075817 dTLB-loads                                                   [11.56%]
          26753106 dTLB-load-misses          #    0.54% of all dTLB cache hits  [16.80%]
        3553702874 dTLB-stores                                                  [22.37%]
           4720313 dTLB-store-misses         #    0.13%                         [21.46%]
     <not counted> dTLB-prefetches
     <not counted> dTLB-prefetch-misses

      60.000920666 seconds time elapsed

post live migration:
Performance counter stats for thread id '1579':

                 0 page-faults                                                  [100.00%]
                 0 minor-faults                                                 [100.00%]
                 0 major-faults                                                 [100.00%]
             34979 cs                                                           [100.00%]
               441 migrations                                                   [100.00%]
                 0 alignment-faults                                             [100.00%]
                 0 emulation-faults
        6903585501 L1-dcache-loads                                              [22.06%]
         525939560 L1-dcache-load-misses     #    7.62% of all L1-dcache hits   [21.97%]
        5042552685 L1-dcache-stores                                             [22.20%]
          94493742 L1-dcache-store-misses    #    1.8%                          [22.06%]
                 0 L1-dcache-prefetches                                         [22.39%]
                 0 L1-dcache-prefetch-misses                                    [22.47%]
       13022953030 L1-icache-loads                                              [22.25%]
        1957161101 L1-icache-load-misses     #   15.03% of all L1-icache hits   [22.47%]
         348479792 LLC-loads                                                    [22.27%]
          80662778 LLC-load-misses           #   23.15% of all LL-cache hits    [ 5.64%]
         198745620 LLC-stores                                                   [ 5.63%]
          14236497 LLC-store-misses          #    7.1%                          [ 5.41%]
          20757435 LLC-prefetches                                               [ 5.42%]
           5361819 LLC-prefetch-misses       #   25%                            [ 5.69%]
        7235715124 dTLB-loads                                                   [11.26%]
          49895163 dTLB-load-misses          #    0.69% of all dTLB cache hits  [16.96%]
        5168276218 dTLB-stores                                                  [22.44%]
           6765983 dTLB-store-misses         #    0.13%                         [22.24%]
     <not counted> dTLB-prefetches
     <not counted> dTLB-prefetch-misses

The "LLC-load-misses" went up by about 16%. Then, I restarted the process in 
guest, the perf data back to normal,
Performance counter stats for thread id '1579':

                 0 page-faults                                                  [100.00%]
                 0 minor-faults                                                 [100.00%]
                 0 major-faults                                                 [100.00%]
             30594 cs                                                           [100.00%]
               327 migrations                                                   [100.00%]
                 0 alignment-faults                                             [100.00%]
                 0 emulation-faults
        7707091948 L1-dcache-loads                                              [22.10%]
         559829176 L1-dcache-load-misses     #    7.26% of all L1-dcache hits   [22.28%]
        5976654983 L1-dcache-stores                                             [23.22%]
         160436114 L1-dcache-store-misses                                       [22.80%]
                 0 L1-dcache-prefetches                                         [22.51%]
                 0 L1-dcache-prefetch-misses                                    [22.53%]
       13798415672 L1-icache-loads                                              [22.28%]
        2017724676 L1-icache-load-misses     #   14.62% of all L1-icache hits   [22.49%]
         254598008 LLC-loads                                                    [22.86%]
          16035378 LLC-load-misses           #    6.30% of all LL-cache hits    [ 5.36%]
         307019606 LLC-stores                                                   [ 5.60%]
          13665033 LLC-store-misses                                             [ 5.43%]
          17715554 LLC-prefetches                                               [ 5.57%]
           4187006 LLC-prefetch-misses                                          [ 5.44%]
        7811502895 dTLB-loads                                                   [10.72%]
          40547330 dTLB-load-misses          #    0.52% of all dTLB cache hits  [16.31%]
        6144202516 dTLB-stores                                                  [21.58%]
           6313363 dTLB-store-misses                                            [21.91%]
     <not counted> dTLB-prefetches
     <not counted> dTLB-prefetch-misses

      60.000812523 seconds time elapsed

If EPT is disabled, this problem is gone.
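
(For reference, on an Intel host EPT is controlled by a kvm_intel module 
parameter; disabling it looks like this, with all guests shut down first:)

    # check whether EPT is currently enabled (Y/N)
    cat /sys/module/kvm_intel/parameters/ept
    # reload kvm_intel with EPT disabled; requires all guests to be stopped
    modprobe -r kvm_intel
    modprobe kvm_intel ept=0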

I suspect that the KVM hypervisor is involved in this problem.
Based on that suspicion, I want to find two adjacent versions of kvm-kmod, one 
that triggers this problem and one that does not (e.g. 2.6.39 and 3.0-rc1), and 
then analyze the differences between those two versions, or bisect the patches 
between them, to finally find the key patches (a rough sketch follows below).
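
A rough sketch of that bisection, assuming it is driven with git bisect over the 
kernel history that kvm-kmod wraps, and that 2.6.39 turns out good while 3.0-rc1 
is bad:

    git bisect start -- arch/x86/kvm virt/kvm   # limit to KVM code
    git bisect bad v3.0-rc1                     # kernel that shows the regression
    git bisect good v2.6.39                     # kernel that does not
    # at each step: build kvm-kmod against the checked-out tree, reload the
    # modules, rerun the migration test, then mark the result:
    git bisect good                             # or: git bisect bad
    # repeat until git reports the first bad commit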

Any better ideas?

Thanks,
Zhang Haoyu


