qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] fix guest physical bits to match host, to go be


From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH] fix guest physical bits to match host, to go beyond 1TB guests
Date: Wed, 17 Jul 2013 16:01:55 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130625 Thunderbird/17.0.7

Il 17/07/2013 15:39, Eduardo Habkost ha scritto:
> On Wed, Jul 17, 2013 at 10:09:01AM +0200, Paolo Bonzini wrote:
>> Il 16/07/2013 21:42, Eduardo Habkost ha scritto:
>>> On Tue, Jul 16, 2013 at 09:24:30PM +0200, Paolo Bonzini wrote:
>>>> Il 16/07/2013 20:11, Eduardo Habkost ha scritto:
>>>>> For physical bit size, what about extending it in a backwards-compatible
>>>>> way? Something like this:
>>>>>
>>>>>     *eax = 0x0003000; /* 48 bits virtual */
>>>>>     if (ram_size < 1TB) {
>>>>>         physical_size = 40; /* Keeping backwards compatibility */
>>>>>     } else if (ram_size < 4TB) {
>>>>>         physical_size = 42;
>>>>
>>>> Why not go straight up to 44?
>>>
>>> I simply trusted the comment saying: "The physical address space is
>>> limited to 42 bits in exec.c", and assumed we had a 42-bit limit
>>> somewhere else.
>>
>> Yeah, that's obsolete.  We now can go up to 64 (but actually only
>> support 52 because that's what Intel says will be the limit----4PB RAM
>> should be good for everyone, as Bill Gates used to say).
>>
>> So far Intel has been upgrading the physical RAM size in x16 steps
>> (MAXPHYADDR was 36, then 40, then 44).  MAXPHYADDR is how Intel calls
>> what you wrote as physical_size.
> 
> Then if ram_size is too large, we could round it up to a multiple of 4?

Yeah, that would be closer to real processors.

>>      if (supported_host_physical_size() < msb(ram_size)) {
>>          abort();
>>      }
> 
> What if the host max physical size is smaller than the MAXPHYADDR we're
> setting for the VM (but still larger than msb(ram_size))? Will the CPU
> or KVM complain, or will it work?

That would only happen with a buggy guest that points page tables to
non-existent RAM, i.e.:

- host has MAXPHYADDR = 36, corresponding to 64 GB physical address space

- guest has 2 GB RAM, but still we set MAXPHYADDR = 40 for backwards
compatibility

- guest creates a page table for RAM beyond 64GB just because it can

KVM would complain, real hardware probably would return all-ones or
something like that.  I think we can ignore it.

> In other words, do we need a check for
>   (supported_host_physical_size() < physical_size)
> below, as well?

That would have to be conditional on !some_compat_prop, too, because
MAXPHYADDR=40 is beyond what older VMX-enabled CPUs support (for example
Core 2 has MAXPHYADDR=36).  I don't think it's important though.  The
less stringent test vs. msb(ram_size) should be enough.

>>      if (ram_size < 64GB && !some_compat_prop) {
>>          physical_size = 36;
>>      } else if (ram_size < 1TB) {
>>          physical_size = 40;
>>      } else {
>>          physical_size = 44;
>>      }
>>
>> What do you think?
> 
> Why stop at 44? What if ram_size is larger than (1 << 44)?

You can certainly add more.  However, no processors exist yet that have
MAXPHYADDR > 44, so you would have already failed the check above.  I
didn't go beyond 44 because who knows, maybe Intel will go from 44 to 46
and break the tradition of increasing by 4.

> We must never start a VM with physical_size > msb(ram_size), right?
> 
> But I am not sure if we should we simply increase physical_size
> automatically, or abort on some cases where we physical_size ends up
> being too small. (I believe we should simply increase it automatically)

Increasing automatically is fine, that's basically what those "if"s do.

>>>> This makes sense too.  Though the best would be of course to use CPUID
>>>> values coming from the real processors, and only using 40 for backwards
>>>> compatibility.
>>>
>>> We can't use the values coming from the real processors directly, or
>>> we will break live migration.
>>
>> I said real processors, not host processors. :)
> 
> So, you mean setting per-cpu-model values?

Yes.

>> So a Core 2 has MAXPHYADDR=36, Nehalem has IIRC 40, Sandy Bridge has 44,
>> and so on.
> 
> That would work as well (and on pc-1.5 and older we could keep
> physical_size=40 or use the above algorithm). But: I wonder if it would
> be usable. If somebody is using "-cpu Nehalem -m 8T", I believe it would
> make sense to increase the physical address size automatically instead
> of aborting. What do you think?

I think I agree.  This leaf is used so little by the guest (the
hypervisor uses it) that mimicking real CPU models in everything is of
little value.

> If the VM is already broken and has MAXPHYADDR < msb(ram_size), I don't
> worry about ABI stability, because those VMs are not even supposed to be
> running properly. If somebody have such a broken VM running, it seems
> better to make the MAXPHYADDR bits suddenly change to a reasonable value
> (so that MAXPHYADDR >= msb(ram_size)) during live-migration than
> aborting migration, or keeping the broken value.
> 
> So if we use an algorithm that always increase MAXPHYADDR automatically
> (breaking the ABI in the cases where the current is already broken), we
> are not going to abort migration unless the host really can't run our
> guest (this seems to be a less serious problem than aborting because the
> VM configuration is broken and needs to be manually adjusted).

Agreed.  So perhaps per-cpu-model values aren't too useful.

Paolo




reply via email to

[Prev in Thread] Current Thread [Next in Thread]