[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH RESEND 04/18] i386/cpu: Fix number of addressable IDs in CPUI
From: |
Zhao Liu |
Subject: |
Re: [PATCH RESEND 04/18] i386/cpu: Fix number of addressable IDs in CPUID.04H |
Date: |
Wed, 15 Feb 2023 22:33:55 +0800 |
On Wed, Feb 15, 2023 at 06:11:23PM +0800, wangyanan (Y) wrote:
> Date: Wed, 15 Feb 2023 18:11:23 +0800
> From: "wangyanan (Y)" <wangyanan55@huawei.com>
> Subject: Re: [PATCH RESEND 04/18] i386/cpu: Fix number of addressable IDs
> in CPUID.04H
>
> Hi Zhao,
>
> 在 2023/2/13 17:36, Zhao Liu 写道:
> > From: Zhao Liu <zhao1.liu@intel.com>
> >
> > For i-cache and d-cache, the maximum IDs for CPUs sharing cache (
> > CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14]) are
> > both 0, and this means i-cache and d-cache are shared in the SMT level.
> > This is correct if there's single thread per core, but is wrong for the
> > hyper threading case (one core contains multiple threads) since the
> > i-cache and d-cache are shared in the core level other than SMT level.
> >
> > Therefore, in order to be compatible with both multi-threaded and
> > single-threaded situations, we should set i-cache and d-cache be shared
> > at the core level by default.
> >
> > Refer to the fixes of cache_info_passthrough ([1], [2]) and SDM, the
> > CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26] should use the
> > nearest power-of-2 integer.
> >
> > The nearest power-of-2 integer can be caculated by pow2ceil() or by
> > using APIC ID offset (like L3 topology using 1 << die_offset [3]).
> >
> > But in fact, CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26]
> > are associated with APIC ID. For example, in linux kernel, the field
> > "num_threads_sharing" (Bits 25 - 14) is parsed with APIC ID. And for
> > another example, on Alder Lake P, the CPUID.04H:EAX[bits 31:26] is not
> > matched with actual core numbers and it's caculated by:
> > "(1 << (pkg_offset - core_offset)) - 1".
> >
> > Therefore the offset of APIC ID should be preferred to caculate nearest
> > power-of-2 integer for CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits
> > 31:26]:
> > 1. d/i cache is shared in a core, 1 << core_offset should be used
> > instand of "1" in encode_cache_cpuid4() for CPUID.04H.00H:EAX[bits
> > 25:14] and CPUID.04H.01H:EAX[bits 25:14].
> > 2. L2 cache is supposed to be shared in a core as for now, thereby
> > 1 << core_offset should also be used instand of "cs->nr_threads" in
> > encode_cache_cpuid4() for CPUID.04H.02H:EAX[bits 25:14].
> > 3. Similarly, the value for CPUID.04H:EAX[bits 31:26] should also be
> > replaced by the offsets upper SMT level in APIC ID.
> >
> > And since [1] and [2] are good enough to make cache_into_passthrough
> > work well, its "pow2ceil()" uses are enough so that they're no need to
> > be replaced by APIC ID offset way.
> If you uniformly tweak these two places with APIC ID offset too, then
> you can also use the more spec-compliant helpers
> (e.g max_processor_ids_for_cache and max_core_ids_in_pkg) here in
> future patch #18. Would be it best to uniform the code?
Yes, it makes sense! Will also do that. Thanks!
>
> Thanks,
> Yanan
> > [1]: efb3934 (x86: cpu: make sure number of addressable IDs for processor
> > cores meets the spec)
> > [2]: d7caf13 (x86: cpu: fixup number of addressable IDs for logical
> > processors sharing cache)
> > [3]: d65af28 (i386: Update new x86_apicid parsing rules with die_offset
> > support)
> >
> > Fixes: 7e3482f (i386: Helpers to encode cache information consistently)
> > Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> > target/i386/cpu.c | 20 +++++++++++++++-----
> > 1 file changed, 15 insertions(+), 5 deletions(-)
> >
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 29afec12c281..7833505092d8 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -5212,7 +5212,6 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index,
> > uint32_t count,
> > {
> > X86CPU *cpu = env_archcpu(env);
> > CPUState *cs = env_cpu(env);
> > - uint32_t die_offset;
> > uint32_t limit;
> > uint32_t signature[3];
> > X86CPUTopoInfo topo_info;
> > @@ -5308,27 +5307,38 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t
> > index, uint32_t count,
> > *eax = *ebx = *ecx = *edx = 0;
> > } else {
> > *eax = 0;
> > + int addressable_cores_offset = apicid_pkg_offset(&topo_info) -
> > + apicid_core_offset(&topo_info);
> > + int core_offset, die_offset;
> > +
> > switch (count) {
> > case 0: /* L1 dcache info */
> > + core_offset = apicid_core_offset(&topo_info);
> > encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
> > - 1, cs->nr_cores,
> > + (1 << core_offset),
> > + (1 << addressable_cores_offset),
> > eax, ebx, ecx, edx);
> > break;
> > case 1: /* L1 icache info */
> > + core_offset = apicid_core_offset(&topo_info);
> > encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
> > - 1, cs->nr_cores,
> > + (1 << core_offset),
> > + (1 << addressable_cores_offset),
> > eax, ebx, ecx, edx);
> > break;
> > case 2: /* L2 cache info */
> > + core_offset = apicid_core_offset(&topo_info);
> > encode_cache_cpuid4(env->cache_info_cpuid4.l2_cache,
> > - cs->nr_threads, cs->nr_cores,
> > + (1 << core_offset),
> > + (1 << addressable_cores_offset),
> > eax, ebx, ecx, edx);
> > break;
> > case 3: /* L3 cache info */
> > die_offset = apicid_die_offset(&topo_info);
> > if (cpu->enable_l3_cache) {
> > encode_cache_cpuid4(env->cache_info_cpuid4.l3_cache,
> > - (1 << die_offset), cs->nr_cores,
> > + (1 << die_offset),
> > + (1 << addressable_cores_offset),
> > eax, ebx, ecx, edx);
> > break;
> > }
>
[PATCH RESEND 05/18] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid(), Zhao Liu, 2023/02/13
[PATCH RESEND 06/18] i386: Introduce module-level cpu topology to CPUX86State, Zhao Liu, 2023/02/13
[PATCH RESEND 07/18] i386: Support modules_per_die in X86CPUTopoInfo, Zhao Liu, 2023/02/13