[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Qemu-devel] [RFC/PATCH 2] kvm: x86: handle KVM_SET_VCPU_EVENTS/KVM_VCPU
From: |
Herongguang (Stephen) |
Subject: |
[Qemu-devel] [RFC/PATCH 2] kvm: x86: handle KVM_SET_VCPU_EVENTS/KVM_VCPUEVENT_VALID_SMM properly |
Date: |
Thu, 22 Sep 2016 15:51:25 +0800 |
User-agent: |
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 |
After making memory consistent between source and destination
(https://lists.gnu.org/archive/html/qemu-devel/2016-09/msg03069.html), there can
still reproduce instruction emulation failure in destination side if migration
when VM’s in grub stage:
[2016-09-15 06:29:24] monitor_qapi_event_emit:478 {"timestamp": {"seconds": 1473892164,
"microseconds": 99652}, "event": "RESUME"}
KVM internal error. Suberror: 1
emulation failure
EAX=000000b5 EBX=00008fc6 ECX=00005678 EDX=00000000
ESI=000f254b EDI=00000000 EBP=0000f958 ESP=000ee958
EIP=00008000 EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 ffffffff 00809300
CS =a000 000a0000 ffffffff 00809300
SS =0000 00000000 ffffffff 00809300
DS =0000 00000000 ffffffff 00809300
FS =0000 00000000 ffffffff 00809300
GS =0000 00000000 ffffffff 00809300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT= 000f71a0 00000037
IDT= 00000000 00000000
CR0=00000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff <ff> ff ff ff
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[2016-09-15 06:29:24] virtio_set_status:712 virtio-blk device status is 7 that
means DRIVER OK
[2016-09-15 06:29:24] monitor_qapi_event_emit:478 {"timestamp": {"seconds": 1473892164,
"microseconds": 106738}, "event": "STOP"}
Note CS+EIP=0xA8000, which is within SMRAM, however SMM=0 is confusing! So I
found that in kvm_put_vcpu_events, events.flags(KVM_VCPUEVENT_VALID_SMM) is
overwritten by setting events.flags to 0, which is obvious mistaken.
So it comes to patch 3. After patch 3, however it results kvm-kmod crash:
[69328.761479] kvm [51353]: vcpu7, guest rIP: 0xfd31c unhandled wrmsr: 0xd1
data 0
[69337.406083] BUG: unable to handle kernel NULL pointer dereference at
(null)
[69337.414193] IP: [<ffffffffa056a59d>] gfn_to_rmap+0xcd/0xe0 [kvm]
[69337.420357] PGD 0
[69337.422514] Oops: 0000 [#1] SMP
[69337.425783] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs
fscache nfsd auth_rpcgss nfs_acl lockd grace sunrpc xt_CHECKSUM iptable_mangle
ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat xt_conntrack ipt_REJECT
nf_reject_ipv4 bridge stp llc ebtable_filter ebtables ip6table_filter
ip6_tables iptable_filter openvswitch nf_conntrack_ipv6 nf_nat_ipv6
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack
libcrc32c ipmi_devintf ipmi_si ipmi_msghandler intel_rapl sb_edac edac_core
x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel aesni_intel lrw kvm_intel gf128mul iTCO_wdt kvm glue_helper
iTCO_vendor_support ablk_helper cryptd pcspkr sg mei_me mei ioatdma shpchp
lpc_ich wmi mfd_core i2c_i801 vhost_net tun vhost macvtap macvlan ip_tables
ext4 jbd2 mbcache sr_mod sd_mod cdrom isci libsas igb ahci libahci
scsi_transport_sas crc32c_intel libata ptp pps_core serio_raw i2c_algo_bit
megaraid_sas i2c_co!
re dca d
m_mod vfio_pci irqbypass vfio_virqfd vfio_iommu_type1 vfio
[69337.524020] CPU: 3 PID: 51375 Comm: CPU 0/KVM Not tainted 4.7.0 #1
[69337.530315] Hardware name: Huawei Technologies Co., Ltd. Tecal RH2288
V2-8S/BC11SRSB1, BIOS RMIBV365 09/06/2013
[69337.540644] task: ffff88104bcc8000 ti: ffff881055650000 task.ti:
ffff881055650000
[69337.548382] RIP: 0010:[<ffffffffa056a59d>] [<ffffffffa056a59d>]
gfn_to_rmap+0xcd/0xe0 [kvm]
[69337.557105] RSP: 0018:ffff881055653a88 EFLAGS: 00010202
[69337.562541] RAX: ffffc9000b3cf428 RBX: ffff88085624a538 RCX: 000000000000000c
[69337.569788] RDX: ffff88083a6e2bd0 RSI: 00000000000000a7 RDI: 00000000000000a0
[69337.577036] RBP: ffff881055653a88 R08: 000000000000000c R09: ffffc9000b3cf008
[69337.584283] R10: ffffc9000b3cf000 R11: 0000000000000037 R12: ffff8810569d0000
[69337.591530] R13: 0000000000000000 R14: 00000000000000a7 R15: ffff8808d624a538
[69337.598777] FS: 00007faeb900d700(0000) GS:ffff88085e0c0000(0000)
knlGS:0000000000000000
[69337.607116] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[69337.612978] CR2: 0000000000000000 CR3: 000000104c203000 CR4: 00000000000426e0
[69337.620225] Stack:
[69337.622369] ffff881055653af0 ffffffffa056b28d ffffffff00000000
0000000000000001
[69337.630059] 0000000000000001 00007fadb50a7000 0100000000000000
ffff88083a6e2bd0
[69337.637751] ffff8810569d0000 0000000000000001 000ffffffffff000
ffff880000000000
[69337.645442] Call Trace:
[69337.648029] [<ffffffffa056b28d>] mmu_set_spte+0x19d/0x280 [kvm]
[69337.654166] [<ffffffffa056f56b>] __direct_map.part.122+0x19b/0x210 [kvm]
[69337.661078] [<ffffffffa056fa29>] tdp_page_fault+0x239/0x260 [kvm]
[69337.667385] [<ffffffffa0568bf0>] kvm_mmu_page_fault+0x60/0xf0 [kvm]
[69337.673856] [<ffffffffa03f8656>] handle_ept_violation+0x96/0x190 [kvm_intel]
[69337.681106] [<ffffffffa03ff740>] vmx_handle_exit+0x1f0/0xc30 [kvm_intel]
[69337.688019] [<ffffffffa055d9ce>] vcpu_enter_guest+0x90e/0x1190 [kvm]
[69337.694585] [<ffffffffa056444d>] kvm_arch_vcpu_ioctl_run+0xcd/0x3f0 [kvm]
[69337.701581] [<ffffffffa0549fb5>] kvm_vcpu_ioctl+0x295/0x610 [kvm]
[69337.707880] [<ffffffff8110b6cc>] ? do_futex+0x11c/0x530
[69337.713314] [<ffffffff8122e436>] do_vfs_ioctl+0xa6/0x5c0
[69337.718833] [<ffffffff81003276>] ? do_audit_syscall_entry+0x66/0x70
[69337.725303] [<ffffffff810037cf>] ? syscall_trace_enter_phase1+0x11f/0x140
[69337.732291] [<ffffffff8122e9c9>] SyS_ioctl+0x79/0x90
[69337.737470] [<ffffffff81003b12>] do_syscall_64+0x62/0x110
[69337.743085] [<ffffffff816c88a1>] entry_SYSCALL64_slow_path+0x25/0x25
[69337.749641] Code: 8b 38 0f b6 52 28 5d 83 e2 0f 83 ea 01 8d 0c d2 48 63 d2 48 8b
44 d0 18 48 d3 ee 48 d3 ef 48 29 fe 48 8d 04 f0 c3 8d 48 01 eb 80 <48> 8b 3c 25
00 00 00 00 31 c0 eb cb 0f 1f 80 00 00 00 00 66 66
[69337.769707] RIP [<ffffffffa056a59d>] gfn_to_rmap+0xcd/0xe0 [kvm]
[69337.775938] RSP <ffff881055653a88>
[69337.779552] CR2: 0000000000000000
[69337.783467] ---[ end trace 5350f10b8de91e83 ]---
[69337.843957] Kernel panic - not syncing: Fatal exception
[69337.849430] Kernel Offset: disabled
I found that when kernel crashes, in tdp_page_fault ->__direct_map ->mmu_set_spte
->rmap_add->gfn_to_rmap-> __gfn_to_memslot &__gfn_to_rmap,
in __gfn_to_memslot returned NULL, and __gfn_to_rmap uses this NULL pointer
(slot) without checking, which is bad.
After some investigation, I found when kernel crashes, gfn is 0xA7, in SMRAM
region again. And at this time, kvm_memslots_for_spte_role is false,
however actually (vcpu->arch.hflags & HF_SMM_MASK) is true!
So I think there is some lacking in kvm-kmod’s
kvm_vcpu_ioctl_x86_set_vcpu_events that handles KVM_VCPUEVENT_VALID_SMM. I
tried following patch,
it seems works fine.
Do you think this patch is appropriate or not enough? Thanks.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 19f9f9e..f39e839 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3013,8 +3013,10 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct
kvm_vcpu *vcpu,
vcpu->arch.apic->sipi_vector = events->sipi_vector;
if (events->flags & KVM_VCPUEVENT_VALID_SMM) {
- if (events->smi.smm)
+ if (events->smi.smm) {
vcpu->arch.hflags |= HF_SMM_MASK;
+ kvm_mmu_reset_context(vcpu);
+ }
else
vcpu->arch.hflags &= ~HF_SMM_MASK;
vcpu->arch.smi_pending = events->smi.pending;
- [Qemu-devel] [RFC/PATCH] migration: SMRAM dirty bitmap not fetched from kvm-kmod and not send to destination, Herongguang (Stephen), 2016/09/14
- Re: [Qemu-devel] [RFC/PATCH] migration: SMRAM dirty bitmap not fetched from kvm-kmod and not send to destination, Paolo Bonzini, 2016/09/14
- Re: [Qemu-devel] [RFC/PATCH] migration: SMRAM dirty bitmap not fetched from kvm-kmod and not send to destination, Herongguang (Stephen), 2016/09/22
- Re: [Qemu-devel] [RFC/PATCH] migration: SMRAM dirty bitmap not fetched from kvm-kmod and not send to destination, Herongguang (Stephen), 2016/09/22
- Re: [Qemu-devel] [RFC/PATCH] migration: SMRAM dirty bitmap not fetched from kvm-kmod and not send to destination, Paolo Bonzini, 2016/09/23
- Re: [Qemu-devel] [RFC/PATCH] migration: SMRAM dirty bitmap not fetched from kvm-kmod and not send to destination, Herongguang (Stephen), 2016/09/23
- Re: [Qemu-devel] [RFC/PATCH] migration: SMRAM dirty bitmap not fetched from kvm-kmod and not send to destination, Paolo Bonzini, 2016/09/23
- Re: [Qemu-devel] [RFC/PATCH] migration: SMRAM dirty bitmap not fetched from kvm-kmod and not send to destination, Herongguang (Stephen), 2016/09/23
- Re: [Qemu-devel] [RFC/PATCH] migration: SMRAM dirty bitmap not fetched from kvm-kmod and not send to destination, Herongguang (Stephen), 2016/09/25
- Re: [Qemu-devel] [RFC/PATCH] migration: SMRAM dirty bitmap not fetched from kvm-kmod and not send to destination, Paolo Bonzini, 2016/09/26
[Qemu-devel] [RFC/PATCH 2] kvm: x86: handle KVM_SET_VCPU_EVENTS/KVM_VCPUEVENT_VALID_SMM properly,
Herongguang (Stephen) <=
[Qemu-devel] [RFC/PATCH 3] kvm: fix events.flags (KVM_VCPUEVENT_VALID_SMM) overwritten by 0, Herongguang (Stephen), 2016/09/22