Re: [RFC] ich9:cpuhp: add support for cpu hot-unplug with SMI broadcast
From: Igor Mammedov
Subject: Re: [RFC] ich9:cpuhp: add support for cpu hot-unplug with SMI broadcast enabled
Date: Fri, 27 Nov 2020 12:33:56 +0100
On Thu, 26 Nov 2020 19:35:30 -0800
Ankur Arora <ankur.a.arora@oracle.com> wrote:
> On 2020-11-26 4:46 a.m., Laszlo Ersek wrote:
> > On 11/26/20 11:24, Ankur Arora wrote:
> >> On 2020-11-24 4:25 a.m., Igor Mammedov wrote:
> >>> If firmware negotiates ICH9_LPC_SMI_F_CPU_HOT_UNPLUG_BIT feature,
> >>> OSPM on CPU eject will set bit #4 in CPU hotplug block for to be
> >>> ejected CPU to mark it for removal by firmware and trigger SMI
> >>> upcall to let firmware do actual eject.
> >>>
> >>> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> >>> ---
> >>> PS:
> >>> - abuse 5.1 machine type for now to turn off unplug feature
> >>> (it will be moved to 5.2 machine type once new merge window is open)
> >>> ---
> >>> include/hw/acpi/cpu.h | 2 ++
> >>> docs/specs/acpi_cpu_hotplug.txt | 11 +++++++++--
> >>> hw/acpi/cpu.c | 18 ++++++++++++++++--
> >>> hw/i386/acpi-build.c | 5 +++++
> >>> hw/i386/pc.c | 1 +
> >>> hw/isa/lpc_ich9.c | 2 +-
> >>> 6 files changed, 34 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/include/hw/acpi/cpu.h b/include/hw/acpi/cpu.h
> >>> index 0eeedaa491..999caaf510 100644
> >>> --- a/include/hw/acpi/cpu.h
> >>> +++ b/include/hw/acpi/cpu.h
> >>> @@ -22,6 +22,7 @@ typedef struct AcpiCpuStatus {
> >>> uint64_t arch_id;
> >>> bool is_inserting;
> >>> bool is_removing;
> >>> + bool fw_remove;
> >>> uint32_t ost_event;
> >>> uint32_t ost_status;
> >>> } AcpiCpuStatus;
> >>> @@ -50,6 +51,7 @@ void cpu_hotplug_hw_init(MemoryRegion *as, Object
> >>> *owner,
> >>> typedef struct CPUHotplugFeatures {
> >>> bool acpi_1_compatible;
> >>> bool has_legacy_cphp;
> >>> + bool fw_unplugs_cpu;
> >>> const char *smi_path;
> >>> } CPUHotplugFeatures;
> >>> diff --git a/docs/specs/acpi_cpu_hotplug.txt
> >>> b/docs/specs/acpi_cpu_hotplug.txt
> >>> index 9bb22d1270..f68ef6e06c 100644
> >>> --- a/docs/specs/acpi_cpu_hotplug.txt
> >>> +++ b/docs/specs/acpi_cpu_hotplug.txt
> >>> @@ -57,7 +57,11 @@ read access:
> >>> It's valid only when bit 0 is set.
> >>> 2: Device remove event, used to distinguish device for which
> >>> no device eject request to OSPM was issued.
> >>> - 3-7: reserved and should be ignored by OSPM
> >>> + 3: reserved and should be ignored by OSPM
> >>> + 4: if set to 1, OSPM requests firmware to perform device
> >>> eject,
> >>> + firmware shall clear this event by writing 1 into it
> >>> before
> >>> + performing device eject
> >>> + 5-7: reserved and should be ignored by OSPM
> >>> [0x5-0x7] reserved
> >>> [0x8] Command data: (DWORD access)
> >>> contains 0 unless value last stored in 'Command field' is
> >>> one of:
> >>> @@ -82,7 +86,10 @@ write access:
> >>> selected CPU device
> >>> 3: if set to 1 initiates device eject, set by OSPM when it
> >>> triggers CPU device removal and calls _EJ0 method
> >>> - 4-7: reserved, OSPM must clear them before writing to
> >>> register
> >>> + 4: if set to 1 OSPM hands over device eject to firmware,
> >>> + Firmware shall issue device eject request as described
> >>> above
> >>> + (bit #3) and OSPM should not touch device eject bit (#3),
> >>> + 5-7: reserved, OSPM must clear them before writing to
> >>> register
> >>> [0x5] Command field: (1 byte access)
> >>> value:
> >>> 0: selects a CPU device with inserting/removing events and
> >>> diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
> >>> index f099b50927..09d2f20dae 100644
> >>> --- a/hw/acpi/cpu.c
> >>> +++ b/hw/acpi/cpu.c
> >>> @@ -71,6 +71,7 @@ static uint64_t cpu_hotplug_rd(void *opaque, hwaddr
> >>> addr, unsigned size)
> >>> val |= cdev->cpu ? 1 : 0;
> >>> val |= cdev->is_inserting ? 2 : 0;
> >>> val |= cdev->is_removing ? 4 : 0;
> >>> + val |= cdev->fw_remove ? 16 : 0;
> >>
> >> I might be missing something but I don't see where cdev->fw_remove is being
> >> set.
> >
> > See just below, in the cpu_hotplug_wr() hunk. When bit#4 is written --
> > which happens through the ACPI code change --, fw_remove is inverted.
> Thanks that makes sense. I was reading the AML building code all wrong.
>
> >
> >
> >> We do set cdev->is_removing in acpi_cpu_unplug_request_cb() so AFAICS
> >> we would always end up setting this bit:
> >>> val |= cdev->is_removing ? 4 : 0;
> >>
> >> Also, if cdev->fw_remove and cdev->is_removing are both true, val would be
> >> (4 | 16). I'm guessing that in that case the AML determines which case gets
> >> handled but it might make sense to set just one of these?
> >
> > "is_removing" is set directly in response to the device_del QMP command.
> > That QMP command is asynchronous to the execution of the guest OS.
> >
> > "fw_remove" is set (by virtue of inverting) by ACPI CEJ0, which is
> > executed by the guest OS's ACPI interpreter, after the guest OS has
> > de-scheduled all processes from the CPU being removed (= basically after
> > the OS has willfully forgotten about the CPU).
> >
> > Therefore, considering the bitmask (is_removing, fw_remove), three
> > variations make sense:
>
> Just annotating these with the corresponding ACPI code to make sure
> I have it straight. Please correct if my interpretation is wrong. Also,
> a few questions inline:
>
> >
> > #1 (is_removing=0, fw_remove=0) -- normal status; no unplug requested
> >
> > #2 (is_removing=1, fw_remove=0) -- unplug requested via QMP, guest OS
> > is processing the request
>
> Guest executes the CSCN method and reads rm_evt (bit 2) (thus noticing
> the is_removing=1), and then notifies the CPU to be removed via the
> CTFY method.
>
> ifctx = aml_if(aml_equal(rm_evt, one));
> {
> aml_append(ifctx,
> aml_call2(CPU_NOTIFY_METHOD, uid, eject_req));
> aml_append(ifctx, aml_store(one, rm_evt));
> aml_append(ifctx, aml_store(one, has_event));
> }
>
> Then it does a store to rm_evt (bit 2). That would result in clearing
> of is_removing. (Igor mentions that in a separate mail.)
>
> 1. Do we need to clear is_removing at all? AFAICS, it's only useful as
> an ack to QEMU and I can't think of why that's useful. OTOH it
> doesn't serve any useful purpose once the guest OS has seen the request.
No, firmware doesn't need to care about it; it's consumed by OSPM only.
> 2. Would it make sense to clear it first and then call CPU_NOTIFY_METHOD?
> CPU_NOTIFY_METHOD (or _EJ0, COST) don't depend on is_removing but
> that might change in the future.
All methods are protected by the same mutex, so if _EJ0 is called while CSCN
is in progress, it will wait until CSCN is finished.
But clearing bit #2 before Notify should work too.
> The notify would end up in calling acpi_hotplug_schedule() which would be
> responsible for queuing work (on CPU0) to detach+unplug the CPU.
>
> Once the OS level detach succeeds, the worker evaluates the "_EJ0" method
> which would do the actual CPU_EJECT_METHOD work.
>
> If the detach fails then it evaluates the CPU_OST_METHOD, which records
> the event and its status.
>
> At this point the state is back to:
>
> (is_removing=0, fw_remove=0)
If OSPM fails to release the CPU for whatever reason, that's a valid
state; we just notify the user via the OST event that the requested unplug
wasn't successful.
>
> > #3 (is_removing=1, fw_remove=1) -- guest OS removed all references from
> > the CPU, firmware is permitted /
> > required to forget about the CPU as
> > well, and then unplug the CPU
>
> CPU_EJECT_METHOD will do a store to bit 4, which would invert (and
> thus set) fw_remove and then do the SMI.
>
> So, this would be
> > #3 (is_removing=0, fw_remove=1)
>
> At this point the firmware calls QemuCPUhpCollectApicIds() which
> (after changes) notices CPU(s) with fw_remove set.
>
> Collects them and does a store to bit 4, which would clear fw_remove.
I'd skip this step on the firmware side and make QEMU clear it
when the CPU is ejected.
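
To make that variant concrete, here is a minimal standalone sketch of the
flags register, assuming QEMU clears fw_remove itself on eject. It is a
simplified model and not the code in hw/acpi/cpu.c: ModelCpu, the FLAG_*
names and both functions are hypothetical, is_inserting is left out, and a
write to bit 4 sets fw_remove instead of inverting it as the RFC patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for QEMU's AcpiCpuStatus. */
typedef struct {
    bool present;      /* a CPU device is plugged in */
    bool is_removing;  /* unplug was requested (device_del) */
    bool fw_remove;    /* OSPM handed the eject over to firmware */
} ModelCpu;

/* Bit positions follow docs/specs/acpi_cpu_hotplug.txt as quoted above. */
enum {
    FLAG_ENABLED    = 1 << 0,
    FLAG_REMOVE_EVT = 1 << 2,
    FLAG_EJECT      = 1 << 3,  /* write side only */
    FLAG_FW_EJECT   = 1 << 4,
};

/* Read side: compose the flags byte the same way cpu_hotplug_rd() does. */
uint8_t model_read_flags(const ModelCpu *c)
{
    uint8_t val = 0;

    assert(c != NULL);
    val |= c->present ? FLAG_ENABLED : 0;
    val |= c->is_removing ? FLAG_REMOVE_EVT : 0;
    val |= c->fw_remove ? FLAG_FW_EJECT : 0;
    return val;
}

/* Write side: one action per write, checked in priority order. */
void model_write_flags(ModelCpu *c, uint8_t data)
{
    assert(c != NULL);
    if (c->is_removing && (data & FLAG_REMOVE_EVT)) {
        c->is_removing = false;   /* OSPM acknowledges the remove event */
    } else if (data & FLAG_EJECT) {
        c->present = false;       /* the device goes away */
        c->fw_remove = false;     /* assumption: cleared here, not by firmware */
    } else if (data & FLAG_FW_EJECT) {
        c->fw_remove = true;      /* OSPM asks firmware to do the eject */
    }
}
```

With this shape the firmware only ever reads bit 4 and writes bit 3; it
never has to write bit 4 back to acknowledge the request.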
>
> >
> > #4 (is_removing=1, fw_remove=0) -- firmware is about to unplug the CPU
> >
> > #5 (is_removing=0, fw_remove=0) -- firmware performing unplug
> Firmware does an unplug and writes to bit 3, thus clearing is_removing.
>
> On return from the firmware the guest evaluates the COST again.
It's optional and depends on the OSPM implementation (some do not call it on
success).
> And, eventually goes back to the CSCN where it processes more
> hotplug or unplug events.
In case of unplug, CSCN finishes first, and only after that are the EJ0
calls processed.
> > The variation (is_removing=0, fw_remove=1) is invalid / unused.
>
> /nods
> >
> >
> > The firmware may be investigating the CPU register block between steps
> > #2 and #3 -- in other words, the firmware may see a CPU for which
> > is_removing is set (unplug requested via QMP), but the OS has not vacated
> > yet (fw_remove=0). In that case, the firmware must just skip the CPU --
> > once the OS is done, it will set fw_remove too, and raise another SMI.
> Yeah, it makes sense for the firmware to only care about a CPU once it
> sees fw_remove=1. (And as currently situated, the firmware would never
> see is_removing=1 at all.)
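
The "skip unless fw_remove is set" rule can be sketched as a small filter
over per-CPU flags bytes. This is only an illustration: the function name
and the flat array of flags are hypothetical (real firmware would select
each CPU through the selector register and read its flags one by one), and
it is not the OVMF QemuCPUhpCollectApicIds() implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 4 of the flags byte: OSPM asked firmware to perform the eject. */
#define CPU_FLAG_FW_EJECT (1u << 4)

/*
 * Collect the indices of CPUs whose flags byte has bit 4 set.
 * CPUs with only the remove event (bit 2) pending are skipped: the OS
 * has not let go of them yet and will raise another SMI once it has.
 * Returns the number of indices stored in out (at most out_cap).
 */
size_t collect_fw_eject(const uint8_t *flags, size_t n_cpus,
                        size_t *out, size_t out_cap)
{
    size_t found = 0;

    assert(flags != NULL && out != NULL);
    for (size_t i = 0; i < n_cpus && found < out_cap; i++) {
        if (flags[i] & CPU_FLAG_FW_EJECT) {
            out[found++] = i;
        }
    }
    return found;
}
```

Because the filter looks only at bit 4, it gives the same answer whether or
not is_removing is still set when the SMI handler runs.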
>
>
> Thanks
> Ankur
>
> >
> >
> >>
> >>
> >>> trace_cpuhp_acpi_read_flags(cpu_st->selector, val);
> >>> break;
> >>> case ACPI_CPU_CMD_DATA_OFFSET_RW:
> >>> @@ -148,6 +149,8 @@ static void cpu_hotplug_wr(void *opaque, hwaddr
> >>> addr, uint64_t data,
> >>> hotplug_ctrl = qdev_get_hotplug_handler(dev);
> >>> hotplug_handler_unplug(hotplug_ctrl, dev, NULL);
> >>> object_unparent(OBJECT(dev));
> >>> + } else if (data & 16) {
> >>> + cdev->fw_remove = !cdev->fw_remove;
> >>> }
> >>> break;
> >>> case ACPI_CPU_CMD_OFFSET_WR:
> >>> @@ -332,6 +335,7 @@ const VMStateDescription vmstate_cpu_hotplug = {
> >>> #define CPU_INSERT_EVENT "CINS"
> >>> #define CPU_REMOVE_EVENT "CRMV"
> >>> #define CPU_EJECT_EVENT "CEJ0"
> >>> +#define CPU_FW_EJECT_EVENT "CEJF"
> >>> void build_cpus_aml(Aml *table, MachineState *machine,
> >>> CPUHotplugFeatures opts,
> >>> hwaddr io_base,
> >>> @@ -384,7 +388,10 @@ void build_cpus_aml(Aml *table, MachineState
> >>> *machine, CPUHotplugFeatures opts,
> >>> aml_append(field, aml_named_field(CPU_REMOVE_EVENT, 1));
> >>> /* initiates device eject, write only */
> >>> aml_append(field, aml_named_field(CPU_EJECT_EVENT, 1));
> >>> - aml_append(field, aml_reserved_field(4));
> >>> + aml_append(field, aml_reserved_field(1));
> >>> + /* tell firmware to do device eject, write only */
> >>> + aml_append(field, aml_named_field(CPU_FW_EJECT_EVENT, 1));
> >>> + aml_append(field, aml_reserved_field(2));
> >>> aml_append(field, aml_named_field(CPU_COMMAND, 8));
> >>> aml_append(cpu_ctrl_dev, field);
> >>> @@ -419,6 +426,7 @@ void build_cpus_aml(Aml *table, MachineState
> >>> *machine, CPUHotplugFeatures opts,
> >>> Aml *ins_evt = aml_name("%s.%s", cphp_res_path,
> >>> CPU_INSERT_EVENT);
> >>> Aml *rm_evt = aml_name("%s.%s", cphp_res_path,
> >>> CPU_REMOVE_EVENT);
> >>> Aml *ej_evt = aml_name("%s.%s", cphp_res_path,
> >>> CPU_EJECT_EVENT);
> >>> + Aml *fw_ej_evt = aml_name("%s.%s", cphp_res_path,
> >>> CPU_FW_EJECT_EVENT);
> >>> aml_append(cpus_dev, aml_name_decl("_HID",
> >>> aml_string("ACPI0010")));
> >>> aml_append(cpus_dev, aml_name_decl("_CID",
> >>> aml_eisaid("PNP0A05")));
> >>> @@ -461,7 +469,13 @@ void build_cpus_aml(Aml *table, MachineState
> >>> *machine, CPUHotplugFeatures opts,
> >>> aml_append(method, aml_acquire(ctrl_lock, 0xFFFF));
> >>> aml_append(method, aml_store(idx, cpu_selector));
> >>> - aml_append(method, aml_store(one, ej_evt));
> >>> + if (opts.fw_unplugs_cpu) {
> >>> + aml_append(method, aml_store(one, fw_ej_evt));
> >>> + aml_append(method,
> >>> aml_store(aml_int(OVMF_CPUHP_SMI_CMD),
> >>> + aml_name("%s", opts.smi_path)));
> >>> + } else {
> >>> + aml_append(method, aml_store(one, ej_evt));
> >>> + }
> >> My knowledge of AML is rather rudimentary but this looks mostly
> >> reasonable to me.
> >>
> >> One question: the corresponding code for CPU hotplug does not send an
> >> SMI_CMD.
> >> Why the difference?
> >
> > This code (on eject) is executing *after* the OS kernel has processed
> > the event. But on hotplug, the ordering is different (it must be): in
> > that case, the CSCN (scan) method first notifies the firmware, and then
> > the OS.
> >
> > Thanks
> > Laszlo
> >
> >>
> >> aml_append(while_ctx,
> >> aml_store(aml_derefof(aml_index(new_cpus,
> >> cpu_idx)),
> >> uid));
> >> aml_append(while_ctx,
> >> aml_call2(CPU_NOTIFY_METHOD, uid, dev_chk));
> >> aml_append(while_ctx, aml_store(uid, cpu_selector));
> >> aml_append(while_ctx, aml_store(one, ins_evt));
> >> aml_append(while_ctx, aml_increment(cpu_idx));
> >>
> >>
> >>> aml_append(method, aml_release(ctrl_lock));
> >>> }
> >>> aml_append(cpus_dev, method);
> >>> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> >>> index 1f5c211245..475e76f514 100644
> >>> --- a/hw/i386/acpi-build.c
> >>> +++ b/hw/i386/acpi-build.c
> >>> @@ -96,6 +96,7 @@ typedef struct AcpiPmInfo {
> >>> bool s4_disabled;
> >>> bool pcihp_bridge_en;
> >>> bool smi_on_cpuhp;
> >>> + bool smi_on_cpu_unplug;
> >>> bool pcihp_root_en;
> >>> uint8_t s4_val;
> >>> AcpiFadtData fadt;
> >>> @@ -197,6 +198,7 @@ static void acpi_get_pm_info(MachineState
> >>> *machine, AcpiPmInfo *pm)
> >>> pm->pcihp_io_base = 0;
> >>> pm->pcihp_io_len = 0;
> >>> pm->smi_on_cpuhp = false;
> >>> + pm->smi_on_cpu_unplug = false;
> >>> assert(obj);
> >>> init_common_fadt_data(machine, obj, &pm->fadt);
> >>> @@ -220,6 +222,8 @@ static void acpi_get_pm_info(MachineState
> >>> *machine, AcpiPmInfo *pm)
> >>> pm->cpu_hp_io_base = ICH9_CPU_HOTPLUG_IO_BASE;
> >>> pm->smi_on_cpuhp =
> >>> !!(smi_features & BIT_ULL(ICH9_LPC_SMI_F_CPU_HOTPLUG_BIT));
> >>> + pm->smi_on_cpu_unplug =
> >>> + !!(smi_features &
> >>> BIT_ULL(ICH9_LPC_SMI_F_CPU_HOT_UNPLUG_BIT));
> >>> }
> >>> /* The above need not be conditional on machine type because
> >>> the reset port
> >>> @@ -1582,6 +1586,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> >>> CPUHotplugFeatures opts = {
> >>> .acpi_1_compatible = true, .has_legacy_cphp = true,
> >>> .smi_path = pm->smi_on_cpuhp ? "\\_SB.PCI0.SMI0.SMIC" :
> >>> NULL,
> >>> + .fw_unplugs_cpu = pm->smi_on_cpu_unplug,
> >>> };
> >>> build_cpus_aml(dsdt, machine, opts, pm->cpu_hp_io_base,
> >>> "\\_SB.PCI0", "\\_GPE._E02");
> >>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> >>> index 17b514d1da..2952a00fe6 100644
> >>> --- a/hw/i386/pc.c
> >>> +++ b/hw/i386/pc.c
> >>> @@ -99,6 +99,7 @@
> >>> GlobalProperty pc_compat_5_1[] = {
> >>> { "ICH9-LPC", "x-smi-cpu-hotplug", "off" },
> >>> + { "ICH9-LPC", "x-smi-cpu-hotunplug", "off" },
> >>> };
> >>> const size_t pc_compat_5_1_len = G_N_ELEMENTS(pc_compat_5_1);
> >>> diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
> >>> index 087a18d04d..8c667b7166 100644
> >>> --- a/hw/isa/lpc_ich9.c
> >>> +++ b/hw/isa/lpc_ich9.c
> >>> @@ -770,7 +770,7 @@ static Property ich9_lpc_properties[] = {
> >>> DEFINE_PROP_BIT64("x-smi-cpu-hotplug", ICH9LPCState,
> >>> smi_host_features,
> >>> ICH9_LPC_SMI_F_CPU_HOTPLUG_BIT, true),
> >>> DEFINE_PROP_BIT64("x-smi-cpu-hotunplug", ICH9LPCState,
> >>> smi_host_features,
> >>> - ICH9_LPC_SMI_F_CPU_HOT_UNPLUG_BIT, false),
> >>> + ICH9_LPC_SMI_F_CPU_HOT_UNPLUG_BIT, true),
> >>> DEFINE_PROP_END_OF_LIST(),
> >>> };
> >>>
> >>
> >> Thanks for sending out the patch btw. This helped me crystallize some of
> >> the
> >> corresponding OVMF code.
> >>
> >> Ankur
> >>
> >