From: Jay Zhou
Subject: Re: [Qemu-devel] About QEMU BQL and dirty log switch in Migration
Date: Wed, 17 May 2017 15:35:51 +0800
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0
On 2017/5/17 13:47, Wanpeng Li wrote:
> Hi Zhoujian,
>
> 2017-05-17 10:20 GMT+08:00 Zhoujian (jay) <address@hidden>:
>> Hi Wanpeng,
>>
>>> On 11/05/2017 14:07, Zhoujian (jay) wrote:
>>>> -	 * Scan sptes if dirty logging has been stopped, dropping those
>>>> -	 * which can be collapsed into a single large-page spte.  Later
>>>> -	 * page faults will create the large-page sptes.
>>>> +	 * Reset each vcpu's mmu, then page faults will create the large-page
>>>> +	 * sptes later.
>>>>  	 */
>>>>  	if ((change != KVM_MR_DELETE) &&
>>>>  		(old->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
>>>> -		!(new->flags & KVM_MEM_LOG_DIRTY_PAGES))
>>>> -		kvm_mmu_zap_collapsible_sptes(kvm, new);
>>>
>>> This is an unlikely branch (unless guest live migration fails and continues to run on the source machine) rather than a hot path; do you have any performance numbers for your real workloads?
>>
>> Sorry to bother you again. Recently, I have tested the performance before migration and after migration failure using SPEC CPU2006 (https://www.spec.org/cpu2006/), which is a standard performance evaluation tool. These are the results:
>>
>> ******
>> Before migration the score is 153, and the TLB miss statistics of the qemu process are:
>>
>> linux-sjrfac:/mnt/zhoujian # perf stat -e dTLB-load-misses,dTLB-loads,dTLB-store-misses, \
>> dTLB-stores,iTLB-load-misses,iTLB-loads -p 26463 sleep 10
>>
>> Performance counter stats for process id '26463':
>>
>>         698,938      dTLB-load-misses     #  0.13% of all dTLB cache hits   (50.46%)
>>     543,303,875      dTLB-loads                                             (50.43%)
>>         199,597      dTLB-store-misses                                      (16.51%)
>>      60,128,561      dTLB-stores                                            (16.67%)
>>          69,986      iTLB-load-misses     #  6.17% of all iTLB cache hits   (16.67%)
>>       1,134,097      iTLB-loads                                             (33.33%)
>>
>>    10.000684064 seconds time elapsed
>>
>> After migration failure the score is 149, and the TLB miss statistics of the qemu process are:
>>
>> linux-sjrfac:/mnt/zhoujian # perf stat -e dTLB-load-misses,dTLB-loads,dTLB-store-misses, \
>> dTLB-stores,iTLB-load-misses,iTLB-loads -p 26463 sleep 10
>>
>> Performance counter stats for process id '26463':
>>
>>         765,400      dTLB-load-misses     #  0.14% of all dTLB cache hits   (50.50%)
>>     540,972,144      dTLB-loads                                             (50.47%)
>>         207,670      dTLB-store-misses                                      (16.50%)
>>      58,363,787      dTLB-stores                                            (16.67%)
>>         109,772      iTLB-load-misses     #  9.52% of all iTLB cache hits   (16.67%)
>>       1,152,784      iTLB-loads                                             (33.32%)
>>
>>    10.000703078 seconds time elapsed
>> ******
>
> Could you comment out the original "lazy collapse small sptes into large sptes" code in the function kvm_arch_commit_memory_region() and post the results here?
With the patch below,

diff --git a/source/x86/x86.c b/source/x86/x86.c
index 054a7d3..e0288d5 100644
--- a/source/x86/x86.c
+++ b/source/x86/x86.c
@@ -8548,10 +8548,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 	 * which can be collapsed into a single large-page spte.  Later
 	 * page faults will create the large-page sptes.
 	 */
-	if ((change != KVM_MR_DELETE) &&
-		(old->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
-		!(new->flags & KVM_MEM_LOG_DIRTY_PAGES))
-		kvm_mmu_zap_collapsible_sptes(kvm, new);
 
 	/*
 	 * Set up write protection and/or dirty logging for the new slot.

after migration failure the score is 148, and the TLB miss statistics of the qemu process are:

linux-sjrfac:/mnt/zhoujian # perf stat -e dTLB-load-misses,dTLB-loads,dTLB-store-misses,dTLB-stores,iTLB-load-misses,iTLB-loads -p 12432 sleep 10
Performance counter stats for process id '12432':

       1,052,697      dTLB-load-misses     #  0.19% of all dTLB cache hits   (50.45%)
     551,828,702      dTLB-loads                                             (50.46%)
         147,228      dTLB-store-misses                                      (16.55%)
      60,427,834      dTLB-stores                                            (16.50%)
          93,793      iTLB-load-misses     #  7.43% of all iTLB cache hits   (16.67%)
       1,262,137      iTLB-loads                                             (33.33%)
    10.000709900 seconds time elapsed

Regards,
Jay Zhou
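P.S. In case it helps when reading the perf output above: the "% of all ... cache hits" figure printed by perf appears to be simply the miss count divided by the corresponding load count, e.g. for the run above

    iTLB miss rate ~= 93,793 / 1,262,137       ~= 7.4%
    dTLB miss rate ~= 1,052,697 / 551,828,702  ~= 0.19%

so the before/after comparison essentially comes down to comparing these ratios.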
> Regards,
> Wanpeng Li

>> These are the steps:
>> ======
>> (1) The version of kmod is 4.4.11 (slightly modified) and the version of qemu is 2.6.0 (slightly modified); the kmod is applied with the following patch, according to Paolo's advice:
>>
>> diff --git a/source/x86/x86.c b/source/x86/x86.c
>> index 054a7d3..75a4bb3 100644
>> --- a/source/x86/x86.c
>> +++ b/source/x86/x86.c
>> @@ -8550,8 +8550,10 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
>>  	 */
>>  	if ((change != KVM_MR_DELETE) &&
>>  		(old->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
>> -		!(new->flags & KVM_MEM_LOG_DIRTY_PAGES))
>> -		kvm_mmu_zap_collapsible_sptes(kvm, new);
>> +		!(new->flags & KVM_MEM_LOG_DIRTY_PAGES)) {
>> +		printk(KERN_ERR "zj make KVM_REQ_MMU_RELOAD request\n");
>> +		kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_RELOAD);
>> +	}
>>
>>  	/*
>>  	 * Set up write protection and/or dirty logging for the new slot.
>>
>> (2) I started up a memory-preoccupied 10G VM (suse11sp3), which means its RES column in top shows 10G, in order to set up the EPT table in advance.
>>
>> (3) Then I ran the 429.mcf test case of SPEC CPU2006 before migration and after migration failure. 429.mcf is a memory-intensive workload, and the migration failure is constructed deliberately with the following qemu patch:
>>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 5d725d0..88dfc59 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -625,6 +625,9 @@ static void process_incoming_migration_co(void *opaque)
>>                                        MIGRATION_STATUS_ACTIVE);
>>      ret = qemu_loadvm_state(f);
>> +    // deliberately construct the migration failure
>> +    exit(EXIT_FAILURE);
>> +
>>      ps = postcopy_state_get();
>>      trace_process_incoming_migration_co_end(ret, ps);
>>      if (ps != POSTCOPY_INCOMING_NONE) {
>> ======
>>
>> The scores and TLB miss rates are almost the same, and I am confused. May I ask which tool you use to evaluate the performance? If my test steps are wrong, please let me know. Thank you.
>>
>> Regards,
>> Jay Zhou
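Regarding step (2), here is a minimal sketch of the kind of guest-side program that can be used to pre-touch memory so the EPT mappings are built before the benchmark runs. The file name, the default size of 10 GiB and the plain malloc-and-touch loop are only illustrative assumptions, not what was actually run inside the suse11sp3 guest:

/* touch_mem.c - fault in every page of a large allocation so the host
 * populates EPT mappings for it in advance (illustrative sketch only). */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    size_t gib  = (argc > 1) ? strtoul(argv[1], NULL, 0) : 10; /* assumed default: 10 GiB */
    size_t size = gib << 30;
    size_t page = 4096;
    char *buf = malloc(size);

    if (!buf) {
        perror("malloc");
        return 1;
    }

    /* write one byte per page so each page is actually faulted in and stays resident */
    for (size_t off = 0; off < size; off += page)
        buf[off] = 1;

    printf("touched %zu GiB, keeping it resident; press Ctrl-C to exit\n", gib);
    pause(); /* keep the memory mapped while the benchmark runs */
    return 0;
}

Compiled with something like "gcc -O2 touch_mem.c -o touch_mem" and run inside the guest, this makes the guest actually touch that much RAM, which in turn drives the qemu process's RES on the host up to roughly the touched size, i.e. the "RES column is 10G" condition described in step (2).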