[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [RFC 2/3] x86: cphp: prevent guest crash on CPU hotplug when broadca
From: |
Laszlo Ersek |
Subject: |
Re: [RFC 2/3] x86: cphp: prevent guest crash on CPU hotplug when broadcast SMI is in use |
Date: |
Mon, 20 Jul 2020 19:29:23 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0 Thunderbird/52.9.1 |
On 07/17/20 14:57, Igor Mammedov wrote:
> On Tue, 14 Jul 2020 12:56:50 +0200
> Laszlo Ersek <lersek@redhat.com> wrote:
>
>> On 07/10/20 18:17, Igor Mammedov wrote:
> [...]
>
>>> @@ -1508,6 +1508,17 @@ static void pc_cpu_pre_plug(HotplugHandler
>>> *hotplug_dev,
>>> return;
>>> }
>>>
>>> + if (pcms->acpi_dev) {
>>> + Error *local_err = NULL;
>>> +
>>> + hotplug_handler_pre_plug(HOTPLUG_HANDLER(pcms->acpi_dev), dev,
>>> + &local_err);
>>> + if (local_err) {
>>> + error_propagate(errp, local_err);
>>> + return;
>>> + }
>>> + }
>>> +
>>> init_topo_info(&topo_info, x86ms);
>>>
>>> env->nr_dies = x86ms->smp_dies;
>>>
>>
>> (6) This looks sane to me, but I have a question for the *pre-patch*
>> code.
>
> (not so short introduction)
>
> plug callbacks were designed for wiring up hotplugged device, hence
> it has check for acpi_dev which is one of HW connections needed to make
> it work. Later on coldplug was consolidated with it, so plug callbacks
> are also handling coldplugged devices.
>
> then later plug callback was split on pre_ and plug_, where pre_
> part is called right before device_foo.realize() and plug_ right after.
> Split was intended to make graceful failure easier, where pre_ is taking
> care of checking already set properties and optionally setting additional
> properties and can/should fail without side-effects, and plug_ part
> should not fail (maybe there is still cleanup to do there) and used to
> finish wiring after the device had been realized.
>
>
>>
>> I notice that hotplug_handler_pre_plug() is already called from the
>> (completely unrelated) function pc_memory_pre_plug().
>>
>> In pc_memory_pre_plug(), we have the following snippet:
>>
>> /*
>> * When -no-acpi is used with Q35 machine type, no ACPI is built,
>> * but pcms->acpi_dev is still created. Check !acpi_enabled in
>> * addition to cover this case.
>> */
>> if (!pcms->acpi_dev || !x86_machine_is_acpi_enabled(X86_MACHINE(pcms))) {
>> error_setg(errp,
>> "memory hotplug is not enabled: missing acpi device or
>> acpi disabled");
>> return;
>> }
>>
>> Whereas in pc_cpu_pre_plug(), the present patch only adds a
>> "pcms->acpi_dev" nullity check.
>>
>> Should pc_cpu_pre_plug() check for ACPI enablement similarly to
>> pc_memory_pre_plug()?
> extra check is not must have in pc_memory_pre_plug() as it should not break
> anything
> (modulo MMIO memhp interface, which went unnoticed since probably nobody
> uses MMIO memhp registers directly (i.e. QEMU's ACPI tables is sole user))
>
>> I'm asking for two reasons:
>>
>> (6a) for the feature at hand (CPU hotplug with SMI), maybe we should
>> write:
>>
>> if (pcms->acpi_dev && x86_machine_is_acpi_enabled(X86_MACHINE(pcms))) {
> pretty harmless in the same sense as in pc_memory_pre_plug(),
> but I'd rather avoid checks that are not must have.
>
>
>> (6b) or maybe more strictly, copy the check from memory hotplug (just
>> update the error message):
>>
>> if (!pcms->acpi_dev || !x86_machine_is_acpi_enabled(X86_MACHINE(pcms))) {
>> error_setg(errp,
>> "CPU hotplug is not enabled: missing acpi device or acpi
>> disabled");
> can't do it as it will break CPU clodplug when -no-cpi is used
>
>> return;
>> }
>>
>> Because CPU hotplug depends on ACPI too, just like memory hotplug,
>> regardless of firmware, and regardless of guest-SMM. Am I correct to
>> think that?
>
> We have pcms->acpi_dev check in x86 code because isapc/piix4 machines
> will/might not create the pm device (which implements acpi hw).
>
> (1) if (pcmc->pci_enabled && x86_machine_is_acpi_enabled(X86_MACHINE(pcms)))
>
> if that weren't the case, calls to acpi_dev would be unconditional
>
> I'm adding check into pre_plug so we can gracefully reject device_add
> in case firmware is not prepared for handling hotplug SMI. Since
> the later might crash the guest. It's for the sake of better user
> experience since QEMU might easily run older firmware.
>
> If we don't care about that, we can drop negotiation and the check,
> send SMI on hotplug when SMI broadcast is enabled, that might
> crash guest since it might be not supported by used fw, but that
> was already the case for quite a while and I've heard only few
> complaints. (I guess most users have sense not to use something
> not impl./supported)
>
>
>> Basically, I'm asking if we should replicate original commit
>> 8cd91acec8df ("pc: fail memory hot-plug/unplug with -no-acpi and Q35
>> machine type", 2018-01-12) for CPU hotplug first (in a separate patch!),
>> before dealing with "lpc->smi_negotiated_features" in this patch.
>
> I'd rather leave hw part decoupled from acpi tables part,
> nothing prevents users implementing their own fw/os which uses
> our cpuhp interface directly without ACPI.
>
>> Hmm... I'm getting confused. I *do* see similar checks in pc_cpu_plug()
>> and pc_cpu_unplug_request_cb(). But:
>>
>> - I don't understand what determines whether we put the ACPI check in
>> *PRE* plug functions, or the plug functions,
> I beleive [1] answers that
>
>> - and I don't understand why pc_cpu_plug() and
>> pc_cpu_unplug_request_cb() only check "pcms->acpi_dev", and not
>> x86_machine_is_acpi_enabled().
>
> x86_machine_is_acpi_enabled() is not must have in this case as
> callbacks implement only hw part of cpuhp protocol (QEMU part),
> what guest uses to handle it (fw tables(qemu generated),
> or some custom code) is another story.
Thank you for the explanation!
Laszlo
- [RFC 0/3] x86: fix cpu hotplug with secure boot, Igor Mammedov, 2020/07/10
- [RFC 3/3] x68: acpi: trigger SMI before scanning for hotplugged CPUs, Igor Mammedov, 2020/07/10
- Re: [RFC 3/3] x68: acpi: trigger SMI before scanning for hotplugged CPUs, Laszlo Ersek, 2020/07/14
- Re: [RFC 3/3] x68: acpi: trigger SMI before scanning for hotplugged CPUs, Laszlo Ersek, 2020/07/14
- Re: [RFC 3/3] x68: acpi: trigger SMI before scanning for hotplugged CPUs, Igor Mammedov, 2020/07/14
- Re: [RFC 3/3] x68: acpi: trigger SMI before scanning for hotplugged CPUs, Laszlo Ersek, 2020/07/15
- Re: [RFC 3/3] x68: acpi: trigger SMI before scanning for hotplugged CPUs, Igor Mammedov, 2020/07/15
- Re: [RFC 3/3] x68: acpi: trigger SMI before scanning for hotplugged CPUs, Laszlo Ersek, 2020/07/16
- Re: [RFC 3/3] x68: acpi: trigger SMI before scanning for hotplugged CPUs, Igor Mammedov, 2020/07/17
- Re: [RFC 3/3] x68: acpi: trigger SMI before scanning for hotplugged CPUs, Laszlo Ersek, 2020/07/20
[RFC 1/3] x86: lpc9: let firmware negotiate CPU hotplug SMI feature, Igor Mammedov, 2020/07/10