qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SM


From: Laszlo Ersek
Subject: Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
Date: Thu, 24 Nov 2016 01:01:58 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.0

CC Jordan & Mike

On 11/23/16 23:35, Paolo Bonzini wrote:
> 
> 
> On 18/11/2016 11:36, Laszlo Ersek wrote:
>> This is v3 of the series, with updates based on the v2 discussion:
>> <http://lists.nongnu.org/archive/html/qemu-devel/2016-11/msg02687.html>.
>>
>> I've added feature negotiation via the APM_STS ("scratchpad") register.
>> A new spec file called "docs/specs/q35-apm-sts.txt" is included.
>>
>> Tested with new OVMF patches (about to send out those as well).
>> Regression tested with SeaBIOS (beyond simple functional tests with
>> maximum SeaBIOS logging enabled, I used gdb to step through the new
>> ich9_apm_status_changed() callback to see if it was behaving compatibly
>> with SeaBIOS).
>>
>> The series was developed and tested on top of v2.7.0, because v2.8.0-rc0
>> crashes very quickly for me when running OVMF:
>>
>>   kvm_io_ioeventfd_add: error adding ioeventfd: File exists
>>
>> It is my understanding that there are patches on the list for this:
>>
>>   [Qemu-devel] [PATCH v2 for-2.8 0/3] virtio fixes
>>
>> Anyway, the series rebases to v2.8.0-rc0 without as much as context
>> differences.
> 
> Hi Laszlo,
> 
> sorry for the slightly delayed reply.
> 
> First of all, I'm wondering if we would be better off adding a new port
> 0xB1 that is QEMU-specific, instead of reusing 0xB3.

Sure, I can look into that, if we agree that's the best way to proceed,
for now. (Although I'm not really happy about the new memory region
stuff it would require. :()

I CC'd Kevin to learn if he foresaw other uses for the APM_STS register
in SeaBIOS.

BTW, what happens in QEMU if the guest reads an unimplemented port?
Hm... unassigned_io_write() seems to be a no-op, and
unassigned_io_read() returns all-bits-one. This means that for a new
port, the negotiation protocol / values have to be reworked.

Port 0xB1 is occupied by ICH9 according to the spec:

  Table 9-2. Fixed I/O Ranges Decoded by Intel ® ICH9 (Sheet 2 of 2)

  I/O
  Address  Read Target           Write Target          Internal Unit
  -------  --------------------  --------------------  -------------
  B0h–B1h  Interrupt Controller  Interrupt Controller  Interrupt

I wonder if we care -- after all, APM_STS (0xB3) is documented not to
have any hardware effects ("scratchpad register").

> Second, I now remembered the reason why I was against broadcast SMI.
> The reason is that it breaks hot-plug.

How does it break hot-plug? After reading your explanation below: is it
that the broadcast SMI (possibly raised by the OS directly) would get to
the new VCPU before the firmware relocated its SMBASE?

> 
> On hot-plug, the firmware (if it wants to use SMI for anything secure)
> must relocate the SMBASE of the newly-hotplugged CPU before the OS has
> any chance to put its fangs on it.  This is because the OS can send
> direct SMIs and is in control of the area at 0x30000.  So OVMF is
> already broken with respect to hotplug,

Yes. We theorized that there could be further edk2 core modules that
hadn't been open sourced yet, necessary for handling VCPU hotplug.

> but I am not yet sure if these
> patches would break it further.

Hard to say without seeing those modules.

I will speculate: assuming that the non-public modules fit together with
the public modules in some way, I expect the broadcast SMI shouldn't
break them. The reason is that the broadcast SMI / traditional delivery
are the default method in UefiCpuPkg, and in practice they work better
(more reliably) with the rest of the edk2 infrastructure, in my
experience, than the relaxed sync method.

In his review today, Jordan wrote
<https://lists.01.org/pipermail/edk2-devel/2016-November/005088.html>,

> I'm glad we'll be using a mechanism that broadcasts to all the
> processors like the real hardware. It is a bit unfortunate that it
> doesn't go through the b2 port for it.

If broadcast is how real hardware does it (even by default!,
apparently), I expect those as-yet unreleased, hotplug-handling modules
in edk2 should be able to cope with broadcast.

> The full solution is to follow a protocol similar to what real hardware
> does.

Real hardware seems to use broadcast, according to the above...

On 11/23/16 23:35, Paolo Bonzini wrote:

> On hot-plug, before the new CPU starts running, the DSDT should
> generate an SMI (with a well-known value written to 0xB2).

I'm not sure I understand right. If it is the DSDT that writes to 0xB2,
that's just another way to say, "the firmware vendor asks the operating
system to write to 0xB2". If the malicious OS is smart enough, it can
realize (from the hardware signal to run the ACPI GPE handler, IIRC)
that a new VCPU is available, and simply not trigger the SMI.

> The handler
> then:
> 
> 1) waits for SMM rendezvous
> 
> 2) unparks the hotplugged VCPU.  Until the VCPU is unparked, it doesn't
> react to either INITs or of course SIPIs (this would need to be
> implemented separately for both TCG and KVM! but only in QEMU, not in
> the kernel).

Okay, this does plug the hole I sketched out above. This logic (the
QEMU-specific unparking) can be done in another platform API in OVMF I
guess (like those in SmmCpuFeaturesLib), but I wonder if we have to
provide the infrastructure in platform code up to the separate SMI
command handler. I think it again depends on those unreleased modules.

> 3) relocates SMBASE
> 
> 4) records the presence of the new VCPU's APIC id for subsequent SMIs.
> 
> 
> The other important things are:
> 
> * new CPU-hotplug controller (docs/specs/acpi_cpu_hotplug.txt)
> interfaces.  I think this would only need a new bit in the write
> register at 0x4 ("CPU device control fields").
> 
> The 0x0-0x3 write register, currently reserved for reading, might
> become read/write for easier communication with the SMI handler.  The
> SMI handler would write 1 to the new bit in order to unpark the VCPU.
> 
> * how to enable this.  I think it would need a new SMM feature, just
> like those that you are adding here, in order to make it negotiable.  If
> it is not negotiated, but broadcast SMI is negotiated, you'd need to do
> something such as not allowing CPU-hotplug.  (This is the only part that
> I think is required for 2.9).

My hope is that we can work on this incrementally (or at least that we
don't forfeit our chance at SMM + VCPU hotplug) by going down the
broadcast SMI route first. Based on what Jordan wrote, this hope should
not be unfounded.

Also, the SMM stack (across components: core edk2 code, OVMF platform
code, QEMU and KVM) is now more than a year old, and we still have
stability problems (hence this series). I'd like to stabilize the base
features before getting wild with hotplug.

(The fact that OVMF lags behind SeaBIOS in a number of features is just
business as usual.)

So, I'd vastly prefer if I needed to extend this series only with a way
to prevent VCPUs from being hotplugged. I think negotiating SMI
broadcast should be good enough for that, as a hook point, given that it
runs in the firmware before the OS gets control (on S3 resume too).

I guess a global variable (or a board-level query function) that would
cause pc_query_hotpluggable_cpus() to return an empty list would be
frowned upon; what's the recommended method?

There is DeviceClass.hotpluggable, but that's not a dynamic property.

> 
> In order to trigger the SMI, the
> 
>                  ifctx = aml_if(aml_equal(ins_evt, one));
>                  {
>                      aml_append(ifctx,
>                          aml_call2(CPU_NOTIFY_METHOD, cpu_data, dev_chk));
>                      aml_append(ifctx, aml_store(one, ins_evt));
>                      aml_append(ifctx, aml_store(one, has_event));
>                  }
> 
> would be replaced by something like this pseudo-AML:
> 
>               Store(And(smm_features, 2), Local1)
>               ...
>               Store(next_cpu_cmd, cpu_cmd)
>               If (Equal(ins_evt, One)) {
>                       If (Greater(Local1, Zero)) {
>                               Store(CPU_HP_APM_CMD, apm_cmd)
>                       }
>                       CPU_NOTIFY_METHOD(cpu_data, dev_chk)
>                       Store(One, ins_evt)
>                       Store(One, has_event)
>               }
> 
> 
> Of course all this is for OVMF only.  SeaBIOS doesn't need to do
> anything of this, because it actually likes to have its SMIs only on the
> current CPU (and it had better be the BSP, since SMBASE is not relocated
> elsewhere!).
> 
> Igor, any thoughts?
> 
> I understand that this is quite huge in both OVMF and QEMU, but we've
> only been delaying it and we knew about it. :(

I agree about being aware of lack of VCPU hotplug support wrt. SMM.

I disagree about delaying it. I've been aware of problems with the base
SMM features (of differing importance), for example

- that S3 resume with the Ia32 build of OVMF was almost hit-or-miss,

- or that the heavy use of UEFI variable services during Windows 8 boot
caused user-visible choppiness in the rotating animation (because of how
our SMIs worked) -- I think I even mentioned this to you.

Since the stalemate in

  http://lists.nongnu.org/archive/html/qemu-devel/2015-11/msg00125.html

from last November, I've never stopped disliking the lack of broadcast
SMIs, but I couldn't do anything about it, despite it leaving the base
feature set unreliable.

Recently, the open source edk2 SMM stack has acquired a bunch of new
code (you're aware), which exacerbate the instabilities (as I've
documented on the edk2-devel list in excruciating detail). I was
overjoyed when you finally suggested (!) to add broadcast SMI, getting
the discussion un-stuck, because (a) it miraculously fixes everything,
and (b) honestly, I see no other way from keeping the SMM stack from
falling apart otherwise, *without* VCPU hotplug in the picture.

For the last few weeks, I've been working day and night (I'm not
exaggerating) on testing & reviewing edk2 patches related to SMM or even
just multiprocessing, catching errors, reporting them, and even
debugging and fixing them myself. (See for example commits

  00650c531a90 UefiCpuPkg/MpInitLib/X64/MpFuncs.nasm: fix fatal typo
  4af3ae146379 UefiCpuPkg/LocalApicLib: fix feature test for Extended
               Topology CPUID leaf
  1cbd83308989 UefiCpuPkg/MpInitLib: fix feature test for Extended
               Topology CPUID leaf

). Putting out fires more or less as a norm.

Furthermore, I haven't been pretending that VCPU hotplug "doesn't
exist", see

  https://lists.01.org/pipermail/edk2-devel/2016-November/005131.html

for example, which I posted independently, ~8 hours ago.

So, no. I reject the idea that we've been procrastinating SMM enabled
VCPU hotplug. We can't build the second floor while the foundation is
shaking. And then we need to ask Intel to release more code as open source.

Thanks
Laszlo

> 
> Paolo
> 
>> Cc: "Kevin O'Connor" <address@hidden>
>> Cc: "Michael S. Tsirkin" <address@hidden>
>> Cc: Gerd Hoffmann <address@hidden>
>> Cc: Paolo Bonzini <address@hidden>
>>
>> Thanks
>> Laszlo
>>
>> Laszlo Ersek (3):
>>   hw/isa/apm: introduce callback for APM_STS_IOPORT writes
>>   hw/isa/lpc_ich9: add SMI feature negotiation via APM_STS
>>   hw/isa/lpc_ich9: ICH9_APM_STS_F_BROADCAST_SMI: inject SMI on all VCPUs
>>
>>  docs/specs/q35-apm-sts.txt | 80 
>> ++++++++++++++++++++++++++++++++++++++++++++++
>>  include/hw/i386/ich9.h     |  9 ++++++
>>  include/hw/isa/apm.h       |  9 +++---
>>  hw/acpi/piix4.c            |  2 +-
>>  hw/isa/apm.c               | 15 ++++++---
>>  hw/isa/lpc_ich9.c          | 64 +++++++++++++++++++++++++++++++++++--
>>  hw/isa/vt82c686.c          |  2 +-
>>  7 files changed, 168 insertions(+), 13 deletions(-)
>>  create mode 100644 docs/specs/q35-apm-sts.txt
>>




reply via email to

[Prev in Thread] Current Thread [Next in Thread]