qemu-discuss

Re: [Qemu-discuss] Supported hypervisors running VMs in nested VM


From: Bandan Das
Subject: Re: [Qemu-discuss] Supported hypervisors running VMs in nested VM
Date: Mon, 12 Oct 2015 22:05:54 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

Hi Roel,

Rain Maker <address@hidden> writes:

> Thanks again Bandan.
>
> One more question for you. In the KVM source tree (arch/x86/kvm) of
> kernel 4.3-rc4, I see a file "hyperv.c". Comment states:
>
>  * KVM Microsoft Hyper-V emulation
>  *
>  * derived from arch/x86/kvm/x86.c
>  *
>  * Copyright (C) 2006 Qumranet, Inc.
>  * Copyright (C) 2008 Qumranet, Inc.
>  * Copyright IBM Corporation, 2008
>  * Copyright 2010 Red Hat, Inc. and/or its affiliates.
>  * Copyright (C) 2015 Andrey Smetanin <address@hidden>
>  *
>
> According to http://lxr.free-electrons.com/source/arch/x86/kvm/, this
> file was not there in 4.2.
>
> Do you know anything about this (seemingly recent) addition?

As I understand it, Hyper-V exposes a set of paravirtualization interfaces
called enlightenments, and this file holds the relevant MSRs and the
associated logic. I took a quick look at the commit message; it seems the
code has just been moved rather than newly written.

> On my own laptop, I also made a bit of headway. I'm now at the point
> where Hyper-V successfully installs. Did a brute hack to have
> vmx_get_msr() return "5" each time MSR_IA32_FEATURE_CONTROL is called.
> Not really the right way, but I'll worry about that once I actually
> get it running.

Ok, great. I think KVM should return what Hyper-V expects here. It is
definitely worth fixing properly if it helps the installation make progress.
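
For reference, here is how the value 5 decodes. This is only a small Python
sketch based on the bit layout of IA32_FEATURE_CONTROL (MSR 0x3a) in the
Intel SDM; the constant and function names are made up for illustration and
are not taken from the KVM source.

```python
# Bit layout of IA32_FEATURE_CONTROL (MSR 0x3a) per the Intel SDM.
# All names here are illustrative, not KVM identifiers.
FEATURE_CONTROL_LOCKED = 1 << 0             # MSR cannot be written until reset
FEATURE_CONTROL_VMXON_INSIDE_SMX = 1 << 1   # VMXON permitted inside SMX (TXT)
FEATURE_CONTROL_VMXON_OUTSIDE_SMX = 1 << 2  # VMXON permitted outside SMX


def decode_feature_control(value):
    """Return the names of the low three bits that are set in `value`."""
    names = []
    if value & FEATURE_CONTROL_LOCKED:
        names.append("locked")
    if value & FEATURE_CONTROL_VMXON_INSIDE_SMX:
        names.append("vmxon_inside_smx")
    if value & FEATURE_CONTROL_VMXON_OUTSIDE_SMX:
        names.append("vmxon_outside_smx")
    return names


def vmxon_allowed_as_is(value):
    """True if VMXON outside SMX works without first writing the MSR.

    VMXON faults unless both the lock bit and the matching enable bit
    are set.
    """
    needed = FEATURE_CONTROL_LOCKED | FEATURE_CONTROL_VMXON_OUTSIDE_SMX
    return value & needed == needed


if __name__ == "__main__":
    for value in (0, 4, 5):
        print(value, decode_feature_control(value), vmxon_allowed_as_is(value))
```

So 5 means "locked, VMXON enabled outside SMX". A value of 0 means the MSR
is unlocked with nothing enabled: a guest hypervisor that writes and locks
the MSR itself can still proceed, while one that only reads it will
conclude that virtualization is disabled.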

> I am still unable to run L2 VMs though. Getting errors like
> "Hypervisor is not running". "systeminfo" now says everything is fine:
>
> Hyper-V Requirements:      VM Monitor Mode Extensions: Yes
>                            Virtualization Enabled In Firmware: Yes
>                            Second Level Address Translation: Yes
>                            Data Execution Prevention Available: Yes
>
> Which is why the installation now at least continues.
>
> When starting a L2 VM, these MSR (decimal) seem to get called
> (obtained by an equally hackish printk statement at the top of
> vmx_get_msr()). I guess I'll go by them one by one and see which one
> is "off". When booting the L1 VM, I also see a lot of negative numbers
> in there.
> Tried enabling ignore_msrs, but that didn't make any difference.

Note that any access to an unimplemented MSR should print a message to the
kernel log, so dmesg should show it. Please also look for other relevant
messages in the L0 dmesg. They might give more clues as to what's missing.

> I'll report back when (if) I get it actually running, because I
> believe it would really be cool if KVM were to support nested HV.
>
> MSRs called when a L2 machine is started:
> 1025
> 1029
> 1033
> 1037
> 1041
> 1045
> 1049
> 1053
> 1057
> 1061
>
> The top ones are actually all "MC0_STATUS". They return "0" on my host
> machine, and do the same in L1 VMs. So it must be one of the other
> ones. Unfortunately, most of them are not listed in the Intel spec..

I am assuming you are running a Windows L2. What might be happening is that
the L2 guest is trying to access the enlightenment MSRs, which of course
traps to L0, and we might not be doing the right thing with them. (Just a
guess.)
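
To make the decimal list easier to read, something like the Python sketch
below can label the indices. The machine-check layout (IA32_MCi_CTL at
0x400 + 4*i, with STATUS/ADDR/MISC at +1/+2/+3) is from the Intel SDM, and
Hyper-V's synthetic MSRs start at 0x40000000. The upper bound used for the
synthetic range, the labels, and the function name are assumptions of this
sketch. It also assumes the printk prints the index as a signed 32-bit
integer, which would explain the negative numbers.

```python
# Rough classifier for MSR indices printed in decimal.
MC_BANK_BASE = 0x400      # IA32_MC0_CTL
MC_BANK_END = 0x480       # VMX capability MSRs (IA32_VMX_BASIC) start here
MC_REGS = ("CTL", "STATUS", "ADDR", "MISC")
SYNTHETIC_BASE = 0x40000000  # Hyper-V synthetic MSRs start here
SYNTHETIC_END = 0x50000000   # assumed upper bound, for this sketch only


def describe_msr(index):
    """Return a rough label for an MSR index, accepting signed input."""
    index &= 0xFFFFFFFF  # undo sign extension of a 32-bit value
    if MC_BANK_BASE <= index < MC_BANK_END:
        bank, reg = divmod(index - MC_BANK_BASE, 4)
        return f"{index:#x} IA32_MC{bank}_{MC_REGS[reg]}"
    if SYNTHETIC_BASE <= index < SYNTHETIC_END:
        return f"{index:#x} hypervisor synthetic MSR range"
    if index >= 0xC0000000:
        return f"{index:#x} AMD64-defined range (EFER, STAR, ...)"
    return f"{index:#x} (not classified by this sketch)"


if __name__ == "__main__":
    for msr in (1025, 1029, 1033, 1037, 1041, 1045, 1049, 1053, 1057, 1061):
        print(msr, "->", describe_msr(msr))
    # Example of a negative value: EFER (0xc0000080) printed as signed.
    print(-1073741696, "->", describe_msr(-1073741696))
```

Run against the list in this mail, every index comes out as a machine-check
status register (banks 0 through 9), so none of the listed reads is in the
0x40000000 range where the enlightenment MSRs live. Any negative numbers
are worth decoding the same way.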

Thank you for keeping us posted!

> http://www.cs.inf.ethz.ch/stricker/lab/doc/intel-part4.pdf
>
> Sincerely,
> Roel Brook
>
>
> 2015-10-08 23:26 GMT+02:00 Bandan Das <address@hidden>:
>> Rain Maker <address@hidden> writes:
>>
>>> The screenshot and weird behavior I posted is within the L1 VM.
>>>
>>> So;
>>> - L0 (host / hypervisor): 0x3a = 5 immediately after boot. L1 VM is
>>> booted WITH -enable-kvm, nested=1, -hypervisor (also tried without
>>> this option), +vmx
>>>
>>> - L1 - Linux (VM / "sub-hypervisor"). 0x3a = 0 immediately after boot.
>>> When a L2 VM is booted with -enable-kvm, 0x3a changes to 5 .
>>> - L1 - Windows. 0x3a = ? (most likely, 0) Windows doesn't have tooling
>>> to read MSR as far as I could find.
>>>
>>> - L2 (under L1 Linux) - Boots fine. Doesn't matter whether I use
>>> -enable-kvm or not
>>> - L2 (under L1 Windows) - Does not start
>>>
>>> As far as I understand it, the BIOS / UEFI should set that MSR to "5".
>>> It should (again, as far as I understand, which is not that much) not
>>> be the task of the operating system.
>>>
>>> So, my question is;
>>> - Why would the MSR 0x3a be 0 after boot?
>>> - Why would it change to 5 after starting a L2 VM with -enable-kvm?
>>> - Is it the responsibility of the BIOS / UEFI code to set that MSR (as
>>> it does on my L0 host), or should the OS set this MSR appropriately?
>>
>> L1 doesn't see the "real" feature control msr. It sees an emulated version.
>> From your experiment, it seems that kvm "sets" it up only after the
>> initial stages of running a guest - vmxon/vmptrld etc. So, you get an
>> expected value only after you start running a guest.
>>
>>> It looks to me like this is a bug somewhere in the Qemu / KVM BIOS
>>> code (MSR returned inappropriately). KVM seems to have a way to
>>> automatically correct this, but Windows does not. I tried this on a VM
>>> booted with the built-in seabios, as well as a VM using the OVMF UEFI
>>> firmware. No difference in behavior.
>>
>> This is KVM's responsibility and I believe that the correct behavior would
>> be to set this if nested = 1 (if that is how it works on real hardware).
>> In that sense, it would be a bug but wouldn't be any useful.
>>
>>> Thank you very much for the help so far.
>>
>> BTW you are mixing up cases by using/not using "-enable-kvm" in your command
>> line. What I mentioned is specific to kvm only, qemu probably will always
>> return a 0 for certain msrs (like this one).
>>
>>> Roel Brook
>>>
>>>
>>>
>>>
>>> 2015-10-07 5:07 GMT+02:00 Bandan Das <address@hidden>:
>>>>
>>>>> On Oct 6, 2015, at 4:43 PM, Rain Maker <address@hidden> wrote:
>>>>>
>>>>> Unfortunately, no difference. With or without -hypervisor doesn't make
>>>>> any difference to that flag.
>>>>>
>>>>> But, experimenting on, I found something <very> odd.
>>>>> The 0x3a register is 0 when the VM boots up.
>>>>> Even when I start a L2 VM, 0x3a is still 0.
>>>>>
>>>>> However, once I start WITH -enable-kvm, 0x3a is suddenly 5(!). See
>>>>> this terminal session, which is executed within the L1 VM ("kvmtest").
>>>>> http://storage4.static.itmages.com/i/15/1006/h_1444163503_6656916_6ffbfd2352.png
>>>>>
>>>>> I was only executing mini.iso (an Ubuntu Netinstaller), and closing it
>>>>> at the boot prompt. I did not do anything in the L2 VM. Both the plain
>>>>> Qemu VM and the -enable-kvm VMs do boot.
>>>>>
>>>>> Is this how the MSR is supposed to react?
>>>>>
>>>>> AFAIK, the MSR can only be modified from kernelspace (which also
>>>>> explains why Qemu would only reset it with -enable-kvm, there are no
>>>>> kernelspace components used without it if I understand correctly)
>>>>>
>>>>> Looking at this, I can imagine that Windows does not detect a correct
>>>>> value. It will get 0. Would it make sense to cygwin KVM and see if
>>>>> that changes the MSR register?
>>>>>
>>>>
>>>> You have to use kvm to run (hardware assisted) nested virtualization. I am
>>>> not sure why you think Windows will read 0 for the feature control msr, but
>>>> you have to use -enable-kvm in L0 when you are launching L1 (Hyper-V in your
>>>> case). When L1 runs L2, you don’t have to worry about using -enable-kvm.
>>>> Hyper-V should automatically detect the hardware features available to it
>>>> and attempt to enable hardware virtualization.
>>>>
>>>>> Sincerely,
>>>>> Roel Brook.
>>>>>
>>>>> 2015-10-05 23:17 GMT+02:00 Bandan Das <address@hidden>:
>>>>>> Rain Maker <address@hidden> writes:
>>>>>>
>>>>>>> Qemu on Linux works fine. I did not even have to explicitly set
>>>>>>> -hypervisor. It simply works.
>>>>>>
>>>>>> Sorry, I meant running Linux with "-hypervisor" to see if specifying
>>>>>> that is somehow messing with the feature flags.
>>>>>>
>>>>>>> As does VirtualBox FYI.
>>>>>>>
>>>>>>> Booting with UEFI didn't make any difference.
>>>>>>>
>>>>>>> After A LOT of Googling, I believe that Hyper-V actually checks bit
>>>>>>> 0x3a of the MSR register (instead of, as the error would .
>>>>>>> This is a 3 bit register (IA32_FEATURE_CONTROL). Within my VM, the
>>>>>>> value returned by rdmsr is "0", while on my host it is "5". For
>>>>>>
>>>>>> That seems odd. Even Linux wouldn't work if that value is 0.
>>>>>>
>>>>>>> Hyper-V to work (from what I understand), it should be either 4 or 5.
>>>>>>>
>>>>>>> Googling that, funny enough, brought me back to this list:
>>>>>>> https://lists.gnu.org/archive/html/qemu-devel/2015-01/msg01371.html.
>>>>>>> I guess it really IS important to know what you're googling for to
>>>>>>> find things fast...
>>>>>>>
>>>>>>> That thread simply says that the kernel is too old. Well, my host is
>>>>>>> running 4.2, so should be new enough.
>>>>>>> I'm a bit stuck. Any ideas?
>>>>>>>
>>>>>>> Sincerely,
>>>>>>> Roel Brook
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2015-10-05 21:18 GMT+02:00 Bandan Das <address@hidden>:
>>>>>>>> Rain Maker <address@hidden> writes:
>>>>>>>>
>>>>>>>>> Thanks Bandan.
>>>>>>>>>
>>>>>>>>> That helped a bit. It got me to the next hurdle, as you suspected.
>>>>>>>>>
>>>>>>>>> I modified the virsh XML so that -cpu host,+vmx,-hypervisor is passed,
>>>>>>>>> and the installation now reports "Hyper-V cannot be installed because
>>>>>>>>> virtualization support is not enabled in the BIOS.".
>>>>>>>>
>>>>>>>> Thanks for trying this out.
>>>>>>>>
>>>>>>>>> I am sure that vmx is passed. but "systeminfo" does report "Hyper-V
>>>>>>>>> cannot be installed because virtualization support is not enabled in
>>>>>>>>> the BIOS."
>>>>>>>>>
>>>>>>>>> Apparently, Microsoft queries the BIOS to verify that the
>>>>>>>>
>>>>>>>> When kvm is initialized, it checks for TXT and VMX both being enabled.
>>>>>>>> That too, only if the feature control msr is locked. I don't think there
>>>>>>>> are actually any specific "bios calls" to find this out. I would assume
>>>>>>>> Hyper-V should be doing the same thing but your testing says otherwise.
>>>>>>>>
>>>>>>>> Can you please run linux as L1 with "-hypervisor" and see if it works ?
>>>>>>>> If it doesn't, please check dmesg for relevant messages.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Bandan
>>>>>>>>
>>>>>>>>
>>>>>>>>> virtualization bit is actually enabled, instead of simply relying on
>>>>>>>>> the VMX flag.
>>>>>>>>> Unfortunately, VMs are still not starting either. The seabios in Qemu
>>>>>>>>> seems to be pretty difficult to modify. I'll check whether I can
>>>>>>>>> reinstall on UEFI, maybe that is going to make a difference.
>>>>>>>>>
>>>>>>>>> The way VMWare does this is actually semi-documented (it hasn't always
>>>>>>>>> been in the product, and a workaround involving manually editing the
>>>>>>>>> configuration has been used for a long time). I'll see if I can
>>>>>>>>> correlate these to Qemu options, to see whether we can use those
>>>>>>>>> instructions to get this working on Qemu.
>>>>>>>>>
>>>>>>>>> 1. Set 'vhv.enable = "TRUE"' on the VM
>>>>>>>>>  It "enables virtual hardware virtualization". This seems equivalent
>>>>>>>>> to the -hypervisor flag
>>>>>>>>>
>>>>>>>>> 2. Set 'monitor.virtual_exec = "hardware"' on the VM.
>>>>>>>>>  This option seems to force hardware virtualization for both CPU and
>>>>>>>>> MMU. Unsure whether there's an equivalent Qemu configuration option.
>>>>>>>>> Unsure whether it's needed on Qemu. Details at
>>>>>>>>> http://www.vmware.com/files/pdf/perf-vsphere-monitor_modes.pdf
>>>>>>>>>
>>>>>>>>> 3. Set 'hypervisor.cpuid.v0 = "FALSE"' in the VM configuration
>>>>>>>>>  This seems synonymous to the +vmx flag
>>>>>>>>>
>>>>>>>>> 4. Enable the option to "Virtualize VT-x/EPT or AMD/RVI"
>>>>>>>>>  I have not found any option to explicitly do this in Qemu. Looking
>>>>>>>>> at my Ubuntu VM, the "ept" flag IS passed to the VM, so this should be
>>>>>>>>> OK.
>>>>>>>>>
>>>>>>>>> 5. Add the following CPU mask Level ECX: ---- ---- ---- ---- ---- ---- --H- ----
>>>>>>>>>  Not sure how to do that in Qemu or what it does. Looking at
>>>>>>>>> https://en.wikipedia.org/wiki/CPUID, it seems to disable the XSAVE
>>>>>>>>> instruction(?). For fun, I passed -cpu ...-xsave, but it did not seem
>>>>>>>>> to make any difference whatsoever.
>>>>>>>>>
>>>>>>>>> Sincerely,
>>>>>>>>> Roel Brook
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2015-10-04 5:07 GMT+02:00 Bandan Das <address@hidden>:
>>>>>>>>>> ...
>>>>>>>>>>> Windows 2012 / 2016 technical preview 3
>>>>>>>>>>> --------------------------------------------------------
>>>>>>>>>>> The installation via the "default" method of Add/Remove Features does
>>>>>>>>>>> not work. Hyper-V displays the error message "A hypervisor is already
>>>>>>>>>>> running".
>>>>>>>>>>>
>>>>>>>>>>> This check can be skipped by using a different method of installation
>>>>>>>>>>> (from PowerShell):
>>>>>>>>>>> Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V
>>>>>>>>>>> -All -NoRestart
>>>>>>>>>>>
>>>>>>>>>>> This results in (again) the server booting up, but being unable to run
>>>>>>>>>>> any guest VMs. The error message is less clear than that in 2008, just
>>>>>>>>>>> "The Virtual Machine Management Service failed to start the virtual
>>>>>>>>>>> machine 'New Virtual Machine' because one of the Hyper-V components is
>>>>>>>>>>> not running (Virtual machine ID
>>>>>>>>>>> 0C063B29-249A-41C8-8A5B-6D4D2E37EF7C)."
>>>>>>>>>>> is what I could find.
>>>>>>>>>>>
>>>>>>>>>>> Other
>>>>>>>>>>> --------
>>>>>>>>>>> Just to verify that "nesting" is actually working, I've also installed
>>>>>>>>>>> an Ubuntu 15.10 VM and installed Qemu on it.
>>>>>>>>>>> This combination CAN successfully run a VM.
>>>>>>>>>>>
>>>>>>>>>>> I've also installed VirtualBox on one of the Windows VMs. This
>>>>>>>>>>> VirtualBox instance is also capable of running virtual machines.
>>>>>>>>>>> According to the icon in the bottom right, VirtualBox IS using the
>>>>>>>>>>> hardware virtualization.
>>>>>>>>>>>
>>>>>>>>>>> Is this a problem specific to Hyper-V? Is there a method to get
>>>>>>>>>>
>>>>>>>>>> Nesting a Hyper-V L1 hypervisor is largely untested. But one of the
>>>>>>>>>> problems I recollect is that Hyper-V doesn’t like running in a
>>>>>>>>>> virtualized environment. It checks the “hypervisor” feature flag
>>>>>>>>>> that Qemu exports. You could try running qemu with
>>>>>>>>>> “-cpu host,-hypervisor” or something similar and see if it makes any
>>>>>>>>>> difference. I suspect there would be other roadblocks though, this
>>>>>>>>>> is just one of them.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Hyper-V working including running guests? I know for a fact that
>>>>>>>>>>> VMWare Workstation / ESX is able to run Hyper-V fully, so it should
>>>>>>>>>>
>>>>>>>>>> Yes, IIRC one of the things ESX does is hide the hypervisor flag 
>>>>>>>>>> specifically for Hyper-V.
>>>>>>>>>>
>>>>>>>>>> Bandan
>>>>>>>>>>
>>>>>>>>>>> not be completely impossible (but I dislike VMWare for different
>>>>>>>>>>> reasons).
>>>>>>>>>>>
>>>>>>>>>>> My Qemu command line (generated by virt-manager). Except for disks and
>>>>>>>>>>> domain names, all are identical:
>>>>>>>>>>>
>>>>>>>>>>> qemu-system-x86_64 -enable-kvm -name Windows_2008_R2 -S -machine
>>>>>>>>>>> pc-i440fx-vivid,accel=kvm,usb=off -cpu
>>>>>>>>>>> SandyBridge,+invtsc,+osxsave,+pcid,+pdcm,+xtpr,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme
>>>>>>>>>>> -m 2048 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid
>>>>>>>>>>> 54a8f3a3-66c2-45a5-a280-ecf7019a67fa -no-user-config -nodefaults
>>>>>>>>>>> -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/Windows_2008_R2.monitor,server,nowait
>>>>>>>>>>> -mon chardev=charmonitor,id=monitor,mode=control -rtc
>>>>>>>>>>> base=localtime,driftfix=slew -global kvm-pit.lost_tick_policy=discard
>>>>>>>>>>> -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global
>>>>>>>>>>> PIIX4_PM.disable_s4=1 -boot strict=on -device
>>>>>>>>>>> ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7 -device
>>>>>>>>>>> ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6
>>>>>>>>>>> -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1
>>>>>>>>>>> -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2
>>>>>>>>>>> -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5
>>>>>>>>>>> -drive file=/sub/kvm/Windows_2008_R2.qcow2,if=none,id=drive-ide0-0-0,format=qcow2,cache=unsafe,aio=threads
>>>>>>>>>>> -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1
>>>>>>>>>>> -drive file=/sub/ISO/en_windows_server_2008_r2_with_sp1_x64_dvd_617601.iso,if=none,id=drive-ide0-0-1,readonly=on,format=raw
>>>>>>>>>>> -device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1
>>>>>>>>>>> -netdev tap,fd=24,id=hostnet0 -device
>>>>>>>>>>> rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:7b:d7:d2,bus=pci.0,addr=0x3
>>>>>>>>>>> -chardev pty,id=charserial0 -device
>>>>>>>>>>> isa-serial,chardev=charserial0,id=serial0 -chardev
>>>>>>>>>>> spicevmc,id=charchannel0,name=vdagent -device
>>>>>>>>>>> virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0
>>>>>>>>>>> -device usb-tablet,id=input0 -spice
>>>>>>>>>>> port=5903,addr=127.0.0.1,disable-ticketing,seamless-migration=on
>>>>>>>>>>> -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16,bus=pci.0,addr=0x2
>>>>>>>>>>> -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device
>>>>>>>>>>> hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev
>>>>>>>>>>> spicevmc,id=charredir0,name=usbredir -device
>>>>>>>>>>> usb-redir,chardev=charredir0,id=redir0 -chardev
>>>>>>>>>>> spicevmc,id=charredir1,name=usbredir -device
>>>>>>>>>>> usb-redir,chardev=charredir1,id=redir1 -chardev
>>>>>>>>>>> spicevmc,id=charredir2,name=usbredir -device
>>>>>>>>>>> usb-redir,chardev=charredir2,id=redir2 -chardev
>>>>>>>>>>> spicevmc,id=charredir3,name=usbredir -device
>>>>>>>>>>> usb-redir,chardev=charredir3,id=redir3 -device
>>>>>>>>>>> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -msg timestamp=on
>>>>>>>>>>>
>>>>>>>>>>> Thank you in advance for response.
>>>>>>>>>>>
>>>>>>>>>>> Sincerely,
>>>>>>>>>>> Roel Brook
>>>>>>>>>>>
>>>>>>>>>>
>>>>


