qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [External] Re: [PATCH] target/i386: Fix cpuid level for AMD


From: Michael Roth
Subject: Re: [External] Re: [PATCH] target/i386: Fix cpuid level for AMD
Date: Fri, 2 Jul 2021 10:43:22 -0500

On Fri, Jul 02, 2021 at 01:14:56PM +0800, zhenwei pi wrote:
> On 7/2/21 4:35 AM, Michael Roth wrote:
> > Quoting Igor Mammedov (2021-07-01 03:43:13)
> > > On Wed, 30 Jun 2021 14:18:09 -0500
> > > Michael Roth <michael.roth@amd.com> wrote:
> > > 
> > > > Quoting Dr. David Alan Gilbert (2021-06-29 09:06:02)
> > > > > * zhenwei pi (pizhenwei@bytedance.com) wrote:
> > > > > > A AMD server typically has cpuid level 0x10(test on Rome/Milan), it
> > > > > > should not be changed to 0x1f in multi-dies case.
> > > > > > 
> > > > > > Fixes: a94e1428991 (target/i386: Add CPUID.1F generation support
> > > > > > for multi-dies PCMachine)
> > > > > > Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
> > > > > 
> > > > > (Copying in Babu)
> > > > > 
> > > > > Hmm I think you're right.  I've cc'd in Babu and Wei.
> > > > > 
> > > > > Eduardo: What do we need to do about compatibility, do we need to wire
> > > > > this to machine type or CPU version?
> > > > 
> > > > FWIW, there are some other CPUID entries like leaves 2 and 4 that are
> > > > also Intel-specific. With SEV-SNP CPUID enforcement, advertising them to
> > > > guests will result in failures when host SNP firmware checks the
> > > > hypervisor-provided CPUID values against the host-supported ones.
> > > > 
> > > > To address this we've been planning to add an 'amd-cpuid-only' property
> > > > to suppress them:
> > > > 
> > > >    
> > > > https://github.com/mdroth/qemu/commit/28d0553fe748d30a8af09e5e58a7da3eff03e21b
> > > > 
> > > > My thinking is this property should be off by default, and only defined
> > > > either via explicit command-line option, or via new CPU types. We're 
> > > > also
> > > > planning to add new CPU versions for EPYC* CPU types that set this
> > > > 'amd-cpuid-only' property by default:
> > > > 
> > > >    https://github.com/mdroth/qemu/commits/new-cpu-types-upstream
> > > It look like having new cpu versions is enough to change behavior,
> > > maybe keep 'amd-cpuid-only' as internal field and not expose it to users
> > > as a property.
> > 
> > Hmm, I defined it as a property mainly to make use of
> > X86CPUVersionDefinition.props to create new versions of the CPU types
> > with those properties set.
> > 
> > There's a patch there that adds X86CPUVersionDefinition.cache_info so
> > that new cache definitions can be provided for new CPU versions. So
> > would you suggest a similar approach here, e.g. adding an
> > X86CPUVersionDefinition.amd_cpuid_only field that could be used directly
> > rather than going through X86CPUVersionDefinition.props?
> > 
> > There's also another new "amd-xsave" prop in that series that does something
> > similar to "amd-cpuid-only", so a little worried about tacking to much extra
> > into X86CPUVersionDefinition. But maybe that one could just be rolled into
> > "amd-cpuid-only" since it is basically fixing up xsave-related cpuid
> > entries for AMD...
> > 
> Hi, this patch wants to fix the issue:
> AMD CPU (Rome/Milan) should get the cpuid level 0x10, not 0x1F in the guest.
> If QEMU reports a 0x1F to guest, guest(Linux) would use leaf 0x1F instead of
> leaf 0xB to get extended topology:
> 
> https://github.com/torvalds/linux/blob/master/arch/x86/kernel/cpu/topology.c#L49
> 
> static int detect_extended_topology_leaf(struct cpuinfo_x86 *c)
> {
>       if (c->cpuid_level >= 0x1f) {
>               if (check_extended_topology_leaf(0x1f) == 0)
>                       return 0x1f;
>       }
> 
>       if (c->cpuid_level >= 0xb) {
>               if (check_extended_topology_leaf(0xb) == 0)
>                       return 0xb;
>       }
> 
>       return -1;
> }
> 
> Because of the wrong cpuid level, the guest gets unexpected topology from
> leaf 0x1F.
> 
> I tested https://github.com/mdroth/qemu/commits/new-cpu-types-upstream, and
> it seems that these patches could not fix this issue.

Yes, I think your patch would still be needed. The question is whether it's
okay to change it for existing CPU types, e.g. EPYC-Milan, or only for new ones
when they set a certain flag/property, like the proposed "amd-cpuid-only" (which
the proposed EPYC-Milan-v2 would set).

> 
> > > 
> > > > So in general I think maybe this change should be similarly controlled 
> > > > by
> > > > this proposed 'amd-cpuid-only' property. Maybe for this particular case 
> > > > it's
> > > > okay to do it unconditionally, but it sounds bad to switch up the valid 
> > > > CPUID
> > > > range after a guest has already booted (which might happen with old->new
> > > > migration for instance), since it might continue treating values in the 
> > > > range
> > > > as valid afterward (but again, not sure that's the case here or not).
> > > > 
> > > > There's some other changes with the new CPU types that we're still
> > > > considering/testing internally, but should be able to post them in some 
> > > > form
> > > > next week.
> > > > 
> > > > -Mike
> > > > 
> > > > > 
> > > > > Dave
> > > > > > ---
> > > > > >   target/i386/cpu.c | 8 ++++++--
> > > > > >   1 file changed, 6 insertions(+), 2 deletions(-)
> > > > > > 
> > > > > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > > > > > index a9fe1662d3..3934c559e4 100644
> > > > > > --- a/target/i386/cpu.c
> > > > > > +++ b/target/i386/cpu.c
> > > > > > @@ -5961,8 +5961,12 @@ void x86_cpu_expand_features(X86CPU *cpu, 
> > > > > > Error **errp)
> > > > > >               }
> > > > > >           }
> > > > > > -        /* CPU topology with multi-dies support requires 
> > > > > > CPUID[0x1F] */
> > > > > > -        if (env->nr_dies > 1) {
> > > > > > +        /*
> > > > > > +         * Intel CPU topology with multi-dies support requires 
> > > > > > CPUID[0x1F].
> > > > > > +         * For AMD Rome/Milan, cpuid level is 0x10, and guest OS 
> > > > > > should detect
> > > > > > +         * extended toplogy by leaf 0xB. Only adjust it for Intel 
> > > > > > CPU.
> > > > > > +         */
> > > > > > +        if ((env->nr_dies > 1) && IS_INTEL_CPU(env)) {
> > > > > >               x86_cpu_adjust_level(cpu, &env->cpuid_min_level, 
> > > > > > 0x1F);
> > > > > >           }
> > > > > > -- 
> > > > > > 2.25.1
> > > > > > 
> > > > > -- 
> > > > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > > > > 
> > > > 
> > > 
> > > 
> 
> -- 
> zhenwei pi
> 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]