Re: [Qemu-devel] [PATCH v2 0/6] x86: Physical address limit patches


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [PATCH v2 0/6] x86: Physical address limit patches
Date: Tue, 5 Jul 2016 11:13:26 +0100
User-agent: Mutt/1.6.1 (2016-04-27)

* Michael S. Tsirkin (address@hidden) wrote:
> On Tue, Jul 05, 2016 at 10:33:25AM +0100, Dr. David Alan Gilbert wrote:
> > * Michael S. Tsirkin (address@hidden) wrote:
> > > On Mon, Jul 04, 2016 at 08:16:03PM +0100, Dr. David Alan Gilbert (git) wrote:
> > > > From: "Dr. David Alan Gilbert" <address@hidden>
> > > > 
> > > > QEMU sets the guest's physical address bits to 40; this is wrong
> > > > on most hardware, and can be detected by the guest.
> > > > It also stops you using really huge multi-TB VMs.
> > > > 
> > > > Red Hat has had a patch downstream, written by Andrea, for a couple
> > > > of years that reads the host's value and uses that in the guest.  That's
> > > > correct as far as the guest sees it, and lets you create huge VMs.
> > > > 
> > > > The downside is that if you've got a mix of hosts, say an i7 and a Xeon,
> > > > life gets complicated in migration; prior to 2.6 it all apparently
> > > > worked (although a guest that looked might spot the change).
> > > > In 2.6 Paolo started checking MSR writes and they failed when the
> > > > incoming MTRR mask didn't fit.
> > > > 
> > > > This series:
> > > >    a) Fixes up mtrr masks so that if you're migrating between hosts
> > > >       of different physical address size it tries to do something sensible.
> > > > 
> > > >    b) Lets you specify the guest physical address size via a CPU property, i.e.
> > > >         -cpu SandyBridge,phys-bits=36
> > > > 
> > > >       The default on old machine types is to use the existing 40-bit value.
> > > > 
> > > >    c) Lets you tell qemu to use the same setting as the host, i.e.
> > > >         -cpu SandyBridge,phys-bits=0
> > > >  
> > > >       This is the default on new machine types.
> > > > 
> > > > Note that mixed size hosts are still not necessarily safe; a guest
> > > > started on a host with a large physical address size might start using
> > > > those bits and get upset when it's moved to a small host.
> > > > However that was already potentially broken in existing qemu that
> > > > used a magic value of 40.
> > > > 
> > > > There's potential to add some extra guards against people
> > > > doing silly stuff; e.g. stop people running VMs using 1TB of
> > > > address space on a tiny host.
> > > > 
> > > > Dave
> > > 
> > > This is all in target-i386 so if the maintainers want it this way, they
> > > can merge this, and I do not have strong objections, but I wanted to
> > > document an alternative that is IMHO somewhat nicer. Feel free to
> > > ignore.  See below.
> > > 
> > > How can guest use more memory than what host supports?
> > > I think there are two ways:
> > > 
> > > 1. more memory than host supports is supplied
> > >    This is a configuration error. We can simply detect this
> > >    and fail init, or print a warning, no need for new flags.
> > 
> > Yes we should do that; however there's a case that's potentially
> > currently working for people but actually kind of illegal.
> > That case is specifying a small amount of actual memory
> > but a large maxmem - i.e.:
> > 
> >      -m 2G,slots=16,maxmem=2T
> > 
> > On a host with a 39bit physaddress limit do you error
> > on that or not?  I think oVirt is currently doing something
> > similar to that, but I'm trying to get confirmation.
> 
> That would only be a problem since pci is allocated above
> maxmem so 64 bit pci addresses aren't accessible.
> With my proposal we can actually force firmware to avoid
> using 64 bit memory for that config.
> Will work better than today.
> 
> 
> > > 2. pci addresses out of host range assigned by guest
> > >    Again normally at least seabios will not do this,
> > >    maybe OVMF will?
> > >    we certainly can add an interface telling firmware
> > >    what the limit is.
> > > 
> > > Thus an alternative is:
> > > - add interface to tell QEMU how much 64 bit memory can pci use.
> > > - teach firmware to limit itself to that
> > > - set guest bits to 48 unconditionally
> > > 
> > > 
> > > the disadvantage of this approach is that firmware needs to be changed
> > 
> > I guess it also needs the CRS to tell the guest OS not
> > to remap PCI stuff into that space?
> 
> CRS is a list of legal addresses, not list of illegal ones.
> So just don't include what's illegal there.
> 
> >  I thought also from the previous
> > discussions that the guest would get a different exception if it
> > actually tried to use any of the bits below 48 it didn't have.
> 
> Basically if you try to map pci at an address outside CRS
> you can get any kind of crash since there could be on-board
> hardware handling these addresses.
> So I do not think we care about that.

The issue about guest bits is not purely about PCI addresses though;
I thought it was also to do with visible behaviour/exceptions in
page tables.
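
To make the page-table point concrete, here is a minimal C sketch. It is not taken from the patch series, and the names MAX_PHYS_BITS, pte_reserved_mask and pte_sets_reserved_bits are made up for illustration. It relies on the architectural rule that page-table entry address bits at or above the CPU's reported physical address width, up to bit 51, are reserved, and that a present entry with one of them set takes a page fault flagged as a reserved-bit violation. How faithfully a given hypervisor setup reproduces that for a guest width that differs from the host's is not covered here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural upper bound on the x86-64 physical address width. */
#define MAX_PHYS_BITS 52u

/*
 * Hypothetical helper: mask of page-table entry address bits that are
 * reserved for a CPU reporting 'phys_bits' physical address bits,
 * i.e. bits [51:phys_bits].
 */
static uint64_t pte_reserved_mask(unsigned phys_bits)
{
    uint64_t all = (UINT64_C(1) << MAX_PHYS_BITS) - 1;
    uint64_t valid;

    if (phys_bits >= MAX_PHYS_BITS) {
        return 0; /* every address bit up to bit 51 is usable */
    }
    valid = (UINT64_C(1) << phys_bits) - 1;
    return all & ~valid;
}

/* Hypothetical helper: would this entry trip the reserved-bit check? */
static bool pte_sets_reserved_bits(uint64_t pte, unsigned phys_bits)
{
    return (pte & pte_reserved_mask(phys_bits)) != 0;
}
```

With these helpers, an entry whose frame address has bit 38 set is acceptable at phys-bits=40 but hits the reserved range at phys-bits=36, which is a difference the guest can see without any PCI mapping being involved.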

> > > the advantage is that we get seamless migration between different
> > > hosts as long as they both can support the configuration,
> > > without any management effort.
> > 
> > The reality (Linux guest) is that this already works as long as you don't
> > map anything into the high address space, and the firmware won't do
> > that unless it's pushed to by an excessive maxmem or huge
> > 64bit PCI bars.
> > 
> > Dave
> 
> Right. So the disadvantage isn't big at all, and I think advantages
> outweigh it.

Except that no one will ever get around to writing the firmware changes
for both sets of firmware; so we never move forward?

Dave

> 
> > > 
> > > > 
> > > > v2
> > > >   Default on new machine types is to read from the host
> > > >   Use the MAKE_64BIT_MASK macro
> > > >   Validate phys_bits in the realise method
> > > >   Move reading the host physical bits to the realise method
> > > >   Set phys-bits even for 32bit guests
> > > > >   Add warning when your phys-bits doesn't match your host in the
> > > > >     non-default case
> > > > 
> > > > Dr. David Alan Gilbert (6):
> > > >   x86: Allow physical address bits to be set
> > > >   x86: Mask mtrr mask based on CPU physical address limits
> > > >   x86: fill high bits of mtrr mask
> > > >   x86: Set physical address bits based on host
> > > >   x86: fix up 32 bit phys_bits case
> > > >   x86: Add sanity checks on phys_bits
> > > > 
> > > >  include/hw/i386/pc.h | 10 ++++++++
> > > > >  target-i386/cpu.c    | 71 ++++++++++++++++++++++++++++++++++++++++++++++------
> > > >  target-i386/cpu.h    |  6 +++++
> > > >  target-i386/kvm.c    | 36 +++++++++++++++++++++++---
> > > >  4 files changed, 112 insertions(+), 11 deletions(-)
> > > > 
> > > > -- 
> > > > 2.7.4
> > --
> > Dr. David Alan Gilbert / address@hidden / Manchester, UK
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK


