qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH RFC V3 1/4] Use Aff1 with mpidr This is an impro


From: Igor Mammedov
Subject: Re: [Qemu-devel] [PATCH RFC V3 1/4] Use Aff1 with mpidr This is an improved and KVM-aware alternative to
Date: Mon, 8 Jun 2015 12:32:49 +0200

On Thu, 4 Jun 2015 18:17:39 +0100
Peter Maydell <address@hidden> wrote:

> On 4 June 2015 at 17:40, Shlomo Pongratz <address@hidden> wrote:
> > From: Pavel Fedin <address@hidden>
> 
> Hi. I think this patch largely makes sense, but I have some comments
> below. If you want to fix these and resend as a standalone patch
> I'm happy to apply that.
> 
> > 1. MPIDR value (minus feature bits) is now directly stored in CPU instance.
> > 2. Default assignment is based on original rule (8 CPUs per cluster).
> > 3. If KVM is used, MPIDR values are overridden with those from KVM. This is
> > necessary for
> > proper KVM PSCI functioning.
> > 4. Some cosmetic changes which would make further expansion of this code 
> > easier.
> > 5. Added some debugging macros because CPU ID assignment is tricky part, 
> > and if
> > there are
> > some problems, i think there should be a quick way to make sure they are
> > correct.
> >  This does not have an RFC mark because it is perfectly okay to be committed
> > alone, it
> > does not break anything. Commit message follows. Cc'ed to all interested
> > parties because i
> > really hope to get things going somewhere this time.
> 
> This sort of commentary belongs below the '---' line, so it doesn't
> end up in the final git commit message.
> 
> > In order to support up to 128 cores with GIC-500 (GICv3 implementation)
> > affinity1 must be used. GIC-500 support up to 32 clusters with up to
> > 8 cores in a cluster. So for example, if one wishes to have 16 cores,
> > the options are: 2 clusters of 8 cores each, 4 clusters with 4 cores each
> > Currently only the first option is supported for TCG. However, KVM expects
> > to have the same clusters layout as host machine has. Therefore, if we use
> > 64-bit KVM we override CPU IDs with those provided by the KVM. For 32-bit
> > systems it is OK to leave the default because so far we do not have more
> > than 8 CPUs on any of them.
> > This implementation has a potential to be extended with some methods which
> > would define cluster layout instead of hardcoded CPUS_PER_CLUSTER
> > definition. This would allow to emulate real machines with different
> > layouts. However, in case of KVM we would still have to inherit IDs from
> > the host.
> 
> I agree that we want to be using the same idea of MPIDR as KVM
> when KVM is enabled, certainly. But this commit message has
> a number of inaccuracies:
>  * KVM doesn't inherit IDs from the host
>  * we don't need to have the same cluster layout as the host machine
>  * this isn't specific to 64-bit VMs so we should fix 32 bits at the same time
> 
> The bug we're actually trying to fix here is just that the
> kernel's mapping from VCPU ID to MPIDR value has changed
> from the simplistic implementation it originally had. So in
> userspace we need to read the MPIDR value back from the kernel
> rather than assuming we know the mapping. (It happens that the
> reason the kernel changed was to accommodate GICv3 and its
> restrictions on affinity values, but that's not relevant to why
> this patch is needed.)
> 
> > Signed-off-by: Shlomo Pongratz <address@hidden>
> > Signed-off-by: Pavel Fedin <address@hidden>
> > ---
> >  hw/arm/virt.c        |  6 +++++-
> >  target-arm/cpu-qom.h |  3 +++
> >  target-arm/cpu.c     | 17 +++++++++++++++++
> >  target-arm/helper.c  |  9 +++------
> >  target-arm/kvm64.c   | 25 +++++++++++++++++++++++++
> >  target-arm/psci.c    | 20 ++++++++++++++++++--
> >  6 files changed, 71 insertions(+), 9 deletions(-)
> >
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 05db8cb..19c8c3a 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -294,7 +294,11 @@ static void fdt_add_cpu_nodes(const VirtBoardInfo *vbi)
> >                                          "enable-method", "psci");
> >          }
> >
> > -        qemu_fdt_setprop_cell(vbi->fdt, nodename, "reg", cpu);
> > +        /*
> > +         * If cpus node's #address-cells property is set to 1
> > +         * The reg cell bits [23:0] must be set to bits [23:0] of 
> > MPIDR_EL1.
> > +         */
> 
> This comment is either obvious or misleading depending on your
> point of view.
> 
> > +        qemu_fdt_setprop_cell(vbi->fdt, nodename, "reg", armcpu->mpidr);
> >          g_free(nodename);
> >      }
> >  }
> > diff --git a/target-arm/cpu-qom.h b/target-arm/cpu-qom.h
> > index ed5a644..a382a09 100644
> > --- a/target-arm/cpu-qom.h
> > +++ b/target-arm/cpu-qom.h
> > @@ -159,6 +159,7 @@ typedef struct ARMCPU {
> >      uint64_t id_aa64mmfr1;
> >      uint32_t dbgdidr;
> >      uint32_t clidr;
> > +    uint64_t mpidr; /* Without feature bits */
> 
> I think it would be less confusing to store the entire MPIDR
> here, and then mask it out if there are places that only
> want the affinity bits.
> 
> >      /* The elements of this array are the CCSIDR values for each cache,
> >       * in the order L1DCache, L1ICache, L2DCache, L2ICache, etc.
> >       */
> > @@ -171,6 +172,8 @@ typedef struct ARMCPU {
> >      uint64_t rvbar;
> >  } ARMCPU;
> >
> > +#define CPUS_PER_CLUSTER 8
> 
> Changes I suggest below will mean this #define is only
> used in one place so it won't need to be in a header.
> 
> > +
> >  #define TYPE_AARCH64_CPU "aarch64-cpu"
> >  #define AARCH64_CPU_CLASS(klass) \
> >      OBJECT_CLASS_CHECK(AArch64CPUClass, (klass), TYPE_AARCH64_CPU)
> > diff --git a/target-arm/cpu.c b/target-arm/cpu.c
> > index 4a888ab..7272466 100644
> > --- a/target-arm/cpu.c
> > +++ b/target-arm/cpu.c
> > @@ -31,6 +31,12 @@
> >  #include "sysemu/kvm.h"
> >  #include "kvm_arm.h"
> >
> > +#ifdef DEBUG
> > +# define DPRINTF(format, ...) printf("armcpu: " format , ## __VA_ARGS__)
> > +#else
> > +# define DPRINTF(format, ...) do { } while (0)
> > +#endif
> 
> Drop these debug macros, please.
or if it's useful then use trace-points instead, which could be turned on/off
without recompiling.

> 
> > +
> >  static void arm_cpu_set_pc(CPUState *cs, vaddr value)
> >  {
> >      ARMCPU *cpu = ARM_CPU(cs);
> > @@ -388,12 +394,23 @@ static void arm_cpu_initfn(Object *obj)
> >      CPUState *cs = CPU(obj);
> >      ARMCPU *cpu = ARM_CPU(obj);
> >      static bool inited;
> > +    uint32_t Aff1, Aff0;
> >
> >      cs->env_ptr = &cpu->env;
> >      cpu_exec_init(&cpu->env);
> >      cpu->cp_regs = g_hash_table_new_full(g_int_hash, g_int_equal,
> >                                           g_free, g_free);
> >
> > +    /*
> > +     * We don't support setting cluster ID ([16..23]) (known as Aff2
> > +     * in later ARM ARM versions), or any of the higher affinity level 
> > fields,
> > +     * so these bits always RAZ.
> > +     */
> > +    Aff1 = cs->cpu_index / CPUS_PER_CLUSTER;
> > +    Aff0 = cs->cpu_index % CPUS_PER_CLUSTER;
> > +    cpu->mpidr = (Aff1 << 8) | Aff0;
> > +    DPRINTF("Init(%p): index %d mpidr 0x%jX\n", cpu, cs->cpu_index, 
> > cpu->mpidr);
> > +
> >  #ifndef CONFIG_USER_ONLY
> >      /* Our inbound IRQ and FIQ lines */
> >      if (kvm_enabled()) {
> > diff --git a/target-arm/helper.c b/target-arm/helper.c
> > index 1cc4993..99c7a30 100644
> > --- a/target-arm/helper.c
> > +++ b/target-arm/helper.c
> > @@ -2063,12 +2063,9 @@ static const ARMCPRegInfo strongarm_cp_reginfo[] = {
> >
> >  static uint64_t mpidr_read(CPUARMState *env, const ARMCPRegInfo *ri)
> >  {
> > -    CPUState *cs = CPU(arm_env_get_cpu(env));
> > -    uint32_t mpidr = cs->cpu_index;
> > -    /* We don't support setting cluster ID ([8..11]) (known as Aff1
> > -     * in later ARM ARM versions), or any of the higher affinity level 
> > fields,
> > -     * so these bits always RAZ.
> > -     */
> > +    ARMCPU *cpu = ARM_CPU(arm_env_get_cpu(env));
> > +    uint64_t mpidr = cpu->mpidr;
> > +
> >      if (arm_feature(env, ARM_FEATURE_V7MP)) {
> >          mpidr |= (1U << 31);
> >          /* Cores which are uniprocessor (non-coherent)
> > diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c
> > index 93c1ca8..99fa34e 100644
> > --- a/target-arm/kvm64.c
> > +++ b/target-arm/kvm64.c
> > @@ -25,6 +25,12 @@
> >  #include "internals.h"
> >  #include "hw/arm/arm.h"
> >
> > +#ifdef DEBUG
> > +# define DPRINTF(format, ...) printf("arm-kvm64: " format , ## __VA_ARGS__)
> > +#else
> > +# define DPRINTF(format, ...) do { } while (0)
> > +#endif
> >
> >  static inline void set_feature(uint64_t *features, int feature)
> >  {
> >      *features |= 1ULL << feature;
> > @@ -77,6 +83,10 @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUClass *ahcc)
> >      return true;
> >  }
> >
> > +#define ARM_MPIDR_HWID_BITMASK 0xFF00FFFFFFULL
> > +#define ARM_CPU_ID             3, 0, 0, 0
> > +#define ARM_CPU_ID_MPIDR       5
> 
> I don't see the point in separating the op2 value from the rest.
> 
> > +
> >  int kvm_arch_init_vcpu(CPUState *cs)
> >  {
> >      int ret;
> > @@ -107,6 +117,21 @@ int kvm_arch_init_vcpu(CPUState *cs)
> >          return ret;
> >      }
> >
> > +    /*
> > +     * When KVM is in use, psci is emulated in-kernel and not by qemu.
> 
> PSCI and QEMU should both be all-caps.
> 
> > +     * In order for it to work correctly we should use correct MPIDR 
> > values,
> > +     * which appear to be inherited from the host.
> 
> I don't think they're inherited from the host: it's just that
> different host kernels use different mappings from vcpu ID
> to MPIDR (see reset_mpidr() in arch/arm64/kvm/sys_regs.c; its
> implementation has changed over time).
> In any case we don't need to speculate; you can just say:
> 
>  In order for it to work correctly we must use MPIDR values in
>  the device tree which match the MPIDR values the kernel has picked
>  for the vcpus, so ask KVM what those values are.
Could we set QEMU's generated mpidr in kernel instead of pulling it from kernel,
like we do with APIC ID in x86 and fix kernel not to reset it its own value
(i.e. untie mpidr from vcpuid)?

Then later we could move setting mpidr from kvm_arch_init_vcpu() into
board code which should be setting it, since it knows/defines what topology it 
has.

> > +     * So we override our defaults by asking KVM about these values.
> > +     */
> > +    ret = kvm_get_one_reg(cs, ARM64_SYS_REG(ARM_CPU_ID, ARM_CPU_ID_MPIDR),
> > +                          &cpu->mpidr);
> > +    if (ret) {
> > +        return ret;
> > +    }
> > +    cpu->mpidr &= ARM_MPIDR_HWID_BITMASK;
> > +    DPRINTF("Init(%p): index %d MPIDR 0x%jX\n", cpu,
> > +            cs->cpu_index, cpu->mpidr);
> > +
> >      return kvm_arm_init_cpreg_list(cpu);
> >  }
> >
> 
> There should also be an equivalent patch to kvm32.c, because
> PSCI on 32-bit systems has the same issue.
> 
> > diff --git a/target-arm/psci.c b/target-arm/psci.c
> > index d8fafab..d03896f 100644
> > --- a/target-arm/psci.c
> > +++ b/target-arm/psci.c
> > @@ -72,6 +72,22 @@ bool arm_is_psci_call(ARMCPU *cpu, int excp_type)
> >      }
> >  }
> >
> > +static uint32_t mpidr_to_idx(uint64_t mpidr)
> > +{
> > +    uint32_t Aff1, Aff0;
> > +
> > +    /*
> > +     * MPIDR_EL1 [RES0:affinity-3:RES1:U:RES0:MT:affinity-1:affinity-0]
> > +     * GIC 500 code currently supports 32 clusters with 8 cores each
> > +     * but no more than 128 cores. Future version will have flexible
> > +     * affinity selection
> > +     * GIC 400 supports 8 cores so 0x7 for Aff0 is O.K. too
> > +     */
> > +     Aff1 = (mpidr & 0xff00) >> 8;
> > +     Aff0 = mpidr & 0xff;
> > +     return Aff1 * CPUS_PER_CLUSTER + Aff0;
> > +}
> 
> > +
> >  void arm_handle_psci_call(ARMCPU *cpu)
> >  {
> >      /*
> > @@ -121,7 +137,7 @@ void arm_handle_psci_call(ARMCPU *cpu)
> >
> >          switch (param[2]) {
> >          case 0:
> > -            target_cpu_state = qemu_get_cpu(mpidr & 0xff);
> > +            target_cpu_state = qemu_get_cpu(mpidr_to_idx(mpidr));
> 
> Rather than doing this, you can just have an
> arm_get_cpu_by_mpidr() which does
>     CPU_FOREACH(cs) {
>         ARMCPU *cpu = ARM_CPU(cs);
> 
>         if (cpu->mpidr == mpidr) {
>             return cpu;
>         }
>     }
>     return NULL;
> 
> ie same as qemu_get_cpu() but checking on mpidr rather than
> cpu_index. Then we don't need to make assumptions about the
> mpidr-to-index mapping here at all (and this code will
> continue to work if in future we decide to expose the
> cluster topology via CPU properties for the board to set.)
> 
> thanks
> -- PMM




reply via email to

[Prev in Thread] Current Thread [Next in Thread]