From: David Gibson
Subject: Re: [Qemu-devel] [PATCH v2 1/1] target-ppc: Implement rtas_get_sysparm(PROCESSOR_MODULE_INFO)
Date: Tue, 1 Dec 2015 14:41:25 +1100
User-agent: Mutt/1.5.24 (2015-08-30)

On Thu, Nov 12, 2015 at 08:46:27AM -0800, Nishanth Aravamudan wrote:
> On 12.11.2015 [15:47:15 +1100], David Gibson wrote:
> > On Wed, Nov 11, 2015 at 02:10:48PM -0800, Nishanth Aravamudan wrote:
> > > On 11.11.2015 [12:41:26 +1100], David Gibson wrote:
> > > > On Tue, Nov 10, 2015 at 04:56:38PM -0800, Nishanth Aravamudan wrote:
> > > > > On 11.11.2015 [11:17:58 +1100], David Gibson wrote:
> > > > > > On Mon, Nov 09, 2015 at 08:22:32PM -0800, Sukadev Bhattiprolu wrote:
> > > 
> > > <snip>
> > > 
> > > > > > > The trouble with xscom is that it's extremely specific to the way
> > > > > > > the current IBM servers present things.  It won't work on other
> > > > > > > types of host machine (which could happen with PR KVM), and could
> > > > > > > even break if IBM changes the way it organizes the SCOMs in a
> > > > > > > future machine.
> > > > > > 
> > > > > > Working from the nodes in /cpus still has some dependencies on IBM
> > > > > > specific properties, but it's at least partially based on OF
> > > > > > standards.
> > > > > > 
> > > > > > > There's also another possible approach here, though I don't know if
> > > > > > > it will work.  Instead of looking directly in the device tree, try
> > > > > > > to get the information from lscpu, or libosinfo.  That would at
> > > > > > > least give you some hope of providing meaningful information on
> > > > > > > other host types.
> > > > > 
> > > > > > Heh, the issue that is underlying all of this, is that `lscpu` itself
> > > > > > is quite wrong.
> > > > > 
> > > > > On PAPR-compliant hypervisors (well, PowerVM, at least), the only
> > > > > supported means of determining the underlying hardware CPU information
> > > > > (which is what licensing models want in the end), is to use this RTAS
> > > > > call in an LPAR. `lscpu` is explicitly incorrect in these environments
> > > > > > (its values are "derived" from sysfs and some are adjusted to ensure
> > > > > the division of values works out).
> > > > 
> > > > So.. I'm not sure if you're just saying that lscpu is wrong because it
> > > > gives the guest information, or because of other problems.
> > > 
> > > `lscpu`'s man-page specifically says that on virtualized platforms, the
> > > output may be inaccurate. And, in fact, on Power, in a KVM guest (and
> > > in an LPAR), `lscpu` is outputting the guest CPU information, which is
> > > completely fake. This is true on x86 KVM guests too, afaict.
> > 
> > Um.. yes, I was assuming lscpu reporting information about virtual
> > cpus and sockets was intended and correct behaviour.
> 
> "lscpu - display information about the CPU architecture"

Right, without qualification I'd take that as virtual architecture.

> but at the same time "lscpu gathers CPU architecture information
> from sysfs and /proc/cpuinfo" which is explicitly logical (or
> virtual).
> 
> but at the same time "There is also information about the CPU caches and
> cache sharing, family, model, bogoMIPS, byte order, and stepping." which
> seems rather physical to me.

bogomips and byte order are absolutely properties of a virtual cpu.
As are family and model, really, since they're generally at least
partially visible to a guest, and there may be some capacity for
faking them (x86 is more flexible in this regard than Power).
Stepping might be, depending on exactly what level the system is
virtualized at (it's not for the case of PAPR).

Cache info is probably purely physical, but since everything else
listed is a property of the virtual cpu, I don't think that's an
argument that lscpu should return host cpu information in general.


> So perhaps, as I kind of stumbled upon myself in my last reply, we
> should explicitly indicate the physical vs. virtual information.
> 
> I will raise this with the lscpu maintainer.
> 
> > > *If* we have a valid RTAS implementation on PowerKVM (or under qemu
> > > generally), I think we can modify `lscpu` to do the right thing in at
> > > least those two environments.
> > > 
> > > > What I was suggesting is implementing the RTAS call so that it
> > > > effectively lets the guest get lscpu information from the host.
> > > 
> > > A bit of a chicken & egg problem, I'd say. The `lscpu` output in PowerNV
> > > is also wrong :)
> > 
> > Ok.. why is it wrong in PowerNV?  This sounds like something you'd
> > want to fix anyway.
> 
> Yes, I never said we wouldn't? It's wrong on PowerNV because chips are
> being counted as sockets, i.e. a 2 DCM system is being counted as a 4
> socket system, rather than a 2 socket system.

Well, sure, but the fact that the tool for the job has a bug doesn't
seem like a great reason to re-implement that tool directly in qemu.

I don't see any chicken and egg problem here: the *powernv* lscpu has
no dependency on the cpu hypercall information.
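
To make the chips-versus-sockets miscount concrete, here is a minimal
sketch (not lscpu's actual implementation) that counts the distinct
physical_package_id values the kernel exposes in sysfs.  The function
name and the overridable root path are assumptions for illustration
only.  Assuming, as described above, that each chip ends up with its
own package id on PowerNV, a 2 DCM machine would yield 4 here, and in
a guest the result reflects only the virtual topology.

```python
import glob
import os


def count_packages(sysfs_cpu_root="/sys/devices/system/cpu"):
    """Count distinct physical_package_id values under the given sysfs root.

    This reports whatever topology the running kernel exposes, which in a
    guest is the virtual topology, not the host's.
    """
    package_ids = set()
    pattern = os.path.join(
        sysfs_cpu_root, "cpu[0-9]*", "topology", "physical_package_id"
    )
    for path in glob.glob(pattern):
        try:
            with open(path) as f:
                package_ids.add(int(f.read().strip()))
        except (OSError, ValueError):
            # Offline or hot-removed CPUs may lack a readable topology entry.
            continue
    return len(package_ids)


if __name__ == "__main__":
    print("packages seen by this kernel:", count_packages())
```

Taking the root as a parameter keeps the sketch testable against a fake
directory tree instead of the live system.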

> > > > > So, we are trying to at least resolve what PowerKVM guest can see by
> > > > > supporting this RTAS call there. We should report *something* to the
> > > > > > guest, if possible, and we can adjust what is reported to the guests
> > > > > > as we go, from the host perspective.
> > > > > 
> > > > > > I haven't followed along too closely in this thread, but would it be
> > > > > reasonable to only report this RTAS call as being supported under
> > > > > KVM?
> > > > 
> > > > Possibly, yes.
> > > 
> > > At least, as a first step, I guess.
> > > 
> > > > > How are other RTAS calls dealt with for PR and non-IBM models
> > > > > currently?
> > > > 
> > > > Most of them still make sense in PR or TCG.  A few do look in the host
> > > > device tree, in which case they're likely to fail on non-KVM.
> > > 
> > > Got it, thanks.
> > > 
> > > So my investigation overall led me to this set of conclusions:
> > > 
> > > 1) Under PowerVM, we do not use this RTAS call, which is the only (as
> > > asserted by pHyp developers) valid way to get hardware information about
> > > the machine. Therefore, the PowerVM `lscpu` output is the "virtual" CPU
> > > information -- where cores are as defined by sharing of the L2-cache.
> > > 
> > > 2) Under PowerKVM, we do not use this RTAS call, because it's not
> > > supported, and just spit out whatever the qemu topology is (which has no
> > > connection to the host (physical) CPU information).
> > 
> > Right.. so does that mean nothing is using this call yet?
> 
> Correct.
> 
> > >  --> so if we implement the RTAS call of some sort under PowerKVM, then
> > > we can update `lscpu` to use that RTAS call.
> > 
> > Yeah, I'm not convinced that's correct.  Shouldn't lscpu return the
> > virtual cpu information, at least by default?
> 
> I think it should return both. *cough* this is a request from your
> employer, actually *cough* :) For billing purposes, physical topology is
> apparently relevant, not virtual (which makes sense, I can make a KVM
> guest with 100 sockets, but I definitely shouldn't be billed for 100
> sockets worth of RH seats, if the physical system only has 2 sockets).

Well, ok.  Do you have any contact information so I can find out
internally what it is they actually need?

> > > 3) Under PowerNV, there is a dependency on the hack that is ibm,chip-id
> > > from OPAL, which leads to twice as many sockets potentially being
> > > reported. `lscpu` also uses the sysfs files directly, which may or may
> > > not be the physical topology (I'm still tracking all of this down). 
> > > 
> > > *Also* `lscpu` has no knowledge of offline/online CPUs, so as you
> > > online/offline CPUs, the output of `lscpu` starts to change.
> > 
> > Ah, true.
> 
> Yeah, I'm still trying to figure out the nuances of this.
> 
> -Nish
> 
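
On the online/offline point, a tool that wants stable output could
compare the kernel's "present" and "online" CPU lists rather than only
walking the per-cpu directories.  The following is a rough sketch under
that assumption.  The helper names are hypothetical, and it relies only
on the standard sysfs cpu list format (comma-separated ids and ranges,
e.g. "0-3,8").

```python
import os


def parse_cpu_list(text):
    """Parse a sysfs cpu list such as "0-3,8,10-11" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file means an empty set
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def read_cpu_set(name, sysfs_cpu_root="/sys/devices/system/cpu"):
    """Read one of the list files (e.g. "online", "present") as a set."""
    with open(os.path.join(sysfs_cpu_root, name)) as f:
        return parse_cpu_list(f.read())


if __name__ == "__main__":
    present = read_cpu_set("present")
    online = read_cpu_set("online")
    print("present but not online:", sorted(present - online))
```

Any topology summary computed only over the online set would change as
CPUs are onlined or offlined, which matches the behaviour described
above.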

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
