Re: [PATCH RFCv2 2/4] i386/pc: relocate 4g start to 1T where applicable


From: Igor Mammedov
Subject: Re: [PATCH RFCv2 2/4] i386/pc: relocate 4g start to 1T where applicable
Date: Tue, 22 Feb 2022 09:46:02 +0100

On Mon, 21 Feb 2022 13:15:40 +0000
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > On Tue, Feb 15, 2022 at 10:53:58AM +0100, Gerd Hoffmann wrote:  
> > >   Hi,
> > >   
> > > > I don't know what behavior should be if firmware tries to program
> > > > PCI64 hole beyond supported phys-bits.  
> > > 
> > > Well, you are basically f*cked.
> > > 
> > > Unfortunately there is no reliable way to figure out what phys-bits actually
> > > is.  Because of that the firmware (both seabios and edk2) tries to place
> > > the pci64 hole as low as possible.
> > > 
> > > The long version:
> > > 
> > > qemu advertises phys-bits=40 to the guest by default.  Probably because
> > > this is what the first amd opteron processors had, assuming that it
> > > would be a safe default.  Then intel came, releasing processors with
> > > phys-bits=36, even recent (desktop-class) hardware has phys-bits=39.
> > > Boom.
> > > 
> > > End result is that edk2 uses a 32G pci64 window by default, which is
> > > placed at the first 32G border beyond normal ram.  So for virtual
> > > machines with up to ~ 30G ram (including reservations for memory
> > > hotplug) the pci64 hole covers 32G -> 64G in guest physical address
> > > space, which is low enough that it works on hardware with phys-bits=36.
> > > 
> > > If your VM has more than 32G of memory the pci64 hole will move and
> > > phys-bits=36 isn't enough any more, but given that you probably only do
> > > that on more beefy hosts which can take >= 64G of RAM and have a larger
> > > physical address space this heuristic works well enough in practice.
> > > 
> > > Changing phys-bits behavior has been discussed on and off for years.
> > > It's tricky to change for live migration compatibility reasons.
> > > 
> > > We got the host-phys-bits and host-phys-bits-limit properties, which
> > > solve some of the phys-bits problems.
> > > 
> > >  * host-phys-bits=on makes sure the phys-bits advertised to the guest
> > >    actually works.  It's off by default though for backward
> > >    compatibility reasons (except microvm).  Also because turning it on
> > >    breaks live migration of machines between hosts with different
> > >    phys-bits.  
> > 
> > RHEL has shipped with host-phys-bits=on in its machine types
> > since RHEL-7. If it is good enough for RHEL machine types
> > for 8 years, IMHO, it is a sign that it's reasonable to do the
> > same with upstream for new machine types.  
> 
> And the upstream code is now pretty much identical except for the
> default;  note that for TCG you do need to keep to 40 I think.

Will TCG work with 40 bits on a host that supports fewer than that?

Also, a quick look at host-phys-bits shows that it affects only the 'host'
CPU model and is a NOP for all other models.
If that's so, then we probably need to expand its scope to other CPU
models to cap them at the actually supported range.
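
To make the capping idea concrete, here is a minimal sketch of how the
advertised phys-bits could be derived from the properties discussed in
this thread. It is not QEMU code: the function names and parameters are
hypothetical, a limit of 0 is assumed to mean "no limit", and the default
of 40 is the value quoted below for the guest-visible default.

```python
def advertised_phys_bits(host_bits, host_phys_bits=False,
                         host_phys_bits_limit=0, default_bits=40):
    """Return the phys-bits value a guest would see (illustrative only).

    host_bits            -- physical address bits the host CPU supports
    host_phys_bits       -- models the host-phys-bits=on/off property
    host_phys_bits_limit -- models host-phys-bits-limit; 0 is assumed to
                            mean "no limit"
    default_bits         -- value advertised when host-phys-bits is off
    """
    if not host_phys_bits:
        # Default behavior described in the thread: a fixed value,
        # regardless of what the host can actually address.
        return default_bits
    bits = host_bits
    if host_phys_bits_limit and bits > host_phys_bits_limit:
        bits = host_phys_bits_limit
    return bits


def capped_phys_bits(model_bits, host_bits):
    """Hypothetical cap for non-'host' CPU models: never advertise more
    bits than the host supports."""
    return min(model_bits, host_bits)


if __name__ == "__main__":
    # Mixed pool of 36-bit and 39-bit hosts, limited to 36 for migration.
    for host in (36, 39):
        print(host, advertised_phys_bits(host, True, 36))
    # A 40-bit CPU model on a 36-bit host, with the proposed cap applied.
    print(capped_phys_bits(40, 36))
```

With the limit set to 36, both host types in the pool advertise the same
value, which is what makes migration in either direction safe.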

> 
> Dave
> >   
> > >  * host-phys-bits-limit can be used to tweak phys-bits to
> > >    be lower than what the host supports.  Which can be used for
> > >    live migration compatibility, i.e. if you have a pool of machines
> > >    where some have 36 and some 39 you can limit phys-bits to 36 so
> > >    live migration from 39 hosts to 36 hosts works.  
> > 
> > RHEL machine types have set this to host-phys-bits-limit=48
> > since RHEL-8 days, to avoid accidentally enabling 5-level
> > paging in guests without explicit user opt-in.
> >   
> > > What is missing:
> > > 
> > >  * Some way for the firmware to get a phys-bits value it can actually
> > >    use.  One possible way would be to have a paravirtual bit somewhere
> > >    telling whether host-phys-bits is enabled or not.  
> > 
> > 
> > Regards,
> > Daniel
> > -- 
> > |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> > |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> > |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
> > 
> >   
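
For reference, the edk2 placement heuristic quoted above (a 32G pci64
window at the first 32G boundary past RAM) can be sketched roughly as
follows. This is an illustration of the arithmetic only, not the firmware
code: the function names are made up, and "ram_top" is assumed to be the
end of guest RAM in guest physical address space, including any hotplug
reservation.

```python
GiB = 1 << 30


def pci64_window(ram_top, window_size=32 * GiB):
    """Return (base, end) of the 64-bit PCI window, end exclusive.

    The base is ram_top rounded up to a multiple of window_size, which is
    one reading of "the first 32G border beyond normal ram".
    """
    base = (ram_top + window_size - 1) // window_size * window_size
    return base, base + window_size


def required_phys_bits(end):
    """Number of address bits needed to reach the last byte below end."""
    return (end - 1).bit_length()


if __name__ == "__main__":
    for ram_gib in (30, 40):
        base, end = pci64_window(ram_gib * GiB)
        print(ram_gib, base // GiB, end // GiB, required_phys_bits(end))
```

For a guest with RAM ending at 30G the window covers 32G -> 64G and fits
in 36 bits; once RAM ends above 32G the window moves up and 36 bits are
no longer enough.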



