qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v6 wave 2 2/3] hw/isa/lpc_ich9: add broadcast SM


From: Igor Mammedov
Subject: Re: [Qemu-devel] [PATCH v6 wave 2 2/3] hw/isa/lpc_ich9: add broadcast SMI feature
Date: Tue, 17 Jan 2017 15:20:56 +0100

On Tue, 17 Jan 2017 14:46:05 +0100
Laszlo Ersek <address@hidden> wrote:

> On 01/17/17 14:20, Igor Mammedov wrote:
> > On Tue, 17 Jan 2017 13:06:13 +0100
> > Laszlo Ersek <address@hidden> wrote:
> >   
> >> On 01/17/17 12:21, Igor Mammedov wrote:  
> >>> On Mon, 16 Jan 2017 21:46:55 +0100
> >>> Laszlo Ersek <address@hidden> wrote:
> >>>     
> >>>> On 01/13/17 14:09, Igor Mammedov wrote:    
> >>>>> On Thu, 12 Jan 2017 19:24:45 +0100
> >>>>> Laszlo Ersek <address@hidden> wrote:
> >>>>>       
> >>>>>> The generic edk2 SMM infrastructure prefers
> >>>>>> EFI_SMM_CONTROL2_PROTOCOL.Trigger() to inject an SMI on each 
> >>>>>> processor. If
> >>>>>> Trigger() only brings the current processor into SMM, then edk2 
> >>>>>> handles it
> >>>>>> in the following ways:
> >>>>>>
> >>>>>> (1) If Trigger() is executed by the BSP (which is guaranteed before
> >>>>>>     ExitBootServices(), but is not necessarily true at runtime), then:
> >>>>>>
> >>>>>>     (a) If edk2 has been configured for "traditional" SMM 
> >>>>>> synchronization,
> >>>>>>         then the BSP sends directed SMIs to the APs with APIC delivery,
> >>>>>>         bringing them into SMM individually. Then the BSP runs the SMI
> >>>>>>         handler / dispatcher.
> >>>>>>
> >>>>>>     (b) If edk2 has been configured for "relaxed" SMM synchronization,
> >>>>>>         then the APs that are not already in SMM are not brought in, 
> >>>>>> and
> >>>>>>         the BSP runs the SMI handler / dispatcher.
> >>>>>>
> >>>>>> (2) If Trigger() is executed by an AP (which is possible after
> >>>>>>     ExitBootServices(), and can be forced e.g. by "taskset -c 1
> >>>>>>     efibootmgr"), then the AP in question brings in the BSP with a
> >>>>>>     directed SMI, and the BSP runs the SMI handler / dispatcher.
> >>>>>>
> >>>>>> The smaller problem with (1a) and (2) is that the BSP and AP
> >>>>>> synchronization is slow. For example, the "taskset -c 1 efibootmgr"
> >>>>>> command from (2) can take more than 3 seconds to complete, because
> >>>>>> efibootmgr accesses non-volatile UEFI variables intensively.
> >>>>>>
> >>>>>> The larger problem is that QEMU's current behavior diverges from the
> >>>>>> behavior usually seen on physical hardware, and that keeps exposing
> >>>>>> obscure corner cases, race conditions and other instabilities in edk2,
> >>>>>> which generally expects / prefers a software SMI to affect all CPUs at
> >>>>>> once.
> >>>>>>
> >>>>>> Therefore introduce the "broadcast SMI" feature that causes QEMU to 
> >>>>>> inject
> >>>>>> the SMI on all VCPUs.
> >>>>>>
> >>>>>> While the original posting of this patch
> >>>>>> <http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg05658.html>
> >>>>>> only intended to speed up (2), based on our recent "stress testing" of 
> >>>>>> SMM
> >>>>>> this patch actually provides functional improvements.
> >>>>>>
> >>>>>> Cc: "Michael S. Tsirkin" <address@hidden>
> >>>>>> Cc: Gerd Hoffmann <address@hidden>
> >>>>>> Cc: Igor Mammedov <address@hidden>
> >>>>>> Cc: Paolo Bonzini <address@hidden>
> >>>>>> Signed-off-by: Laszlo Ersek <address@hidden>
> >>>>>> Reviewed-by: Michael S. Tsirkin <address@hidden>
> >>>>>> ---
> >>>>>>
> >>>>>> Notes:
> >>>>>>     v6:
> >>>>>>     - no changes, pick up Michael's R-b
> >>>>>>     
> >>>>>>     v5:
> >>>>>>     - replace the ICH9_LPC_SMI_F_BROADCAST bit value with the
> >>>>>>       ICH9_LPC_SMI_F_BROADCAST_BIT bit position (necessary for
> >>>>>>       DEFINE_PROP_BIT() in the next patch)
> >>>>>>
> >>>>>>  include/hw/i386/ich9.h |  3 +++
> >>>>>>  hw/isa/lpc_ich9.c      | 10 +++++++++-
> >>>>>>  2 files changed, 12 insertions(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/include/hw/i386/ich9.h b/include/hw/i386/ich9.h
> >>>>>> index da1118727146..18dcca7ebcbf 100644
> >>>>>> --- a/include/hw/i386/ich9.h
> >>>>>> +++ b/include/hw/i386/ich9.h
> >>>>>> @@ -250,4 +250,7 @@ Object *ich9_lpc_find(void);
> >>>>>>  #define ICH9_SMB_HST_D1                         0x06
> >>>>>>  #define ICH9_SMB_HOST_BLOCK_DB                  0x07
> >>>>>>  
> >>>>>> +/* bit positions used in fw_cfg SMI feature negotiation */
> >>>>>> +#define ICH9_LPC_SMI_F_BROADCAST_BIT            0
> >>>>>> +
> >>>>>>  #endif /* HW_ICH9_H */
> >>>>>> diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
> >>>>>> index 376b7801a42c..ced6f803a4f2 100644
> >>>>>> --- a/hw/isa/lpc_ich9.c
> >>>>>> +++ b/hw/isa/lpc_ich9.c
> >>>>>> @@ -437,7 +437,15 @@ static void ich9_apm_ctrl_changed(uint32_t val, 
> >>>>>> void *arg)
> >>>>>>  
> >>>>>>      /* SMI_EN = PMBASE + 30. SMI control and enable register */
> >>>>>>      if (lpc->pm.smi_en & ICH9_PMIO_SMI_EN_APMC_EN) {
> >>>>>> -        cpu_interrupt(current_cpu, CPU_INTERRUPT_SMI);
> >>>>>> +        if (lpc->smi_negotiated_features &
> >>>>>> +            (UINT64_C(1) << ICH9_LPC_SMI_F_BROADCAST_BIT)) {
> >>>>>> +            CPUState *cs;
> >>>>>> +            CPU_FOREACH(cs) {
> >>>>>> +                cpu_interrupt(cs, CPU_INTERRUPT_SMI);
> >>>>>> +            }      
> >>>>> Shouldn't CPUs with default SMI base be excluded from broadcast?      
> >>>>
> >>>> I don't think so; as far as I recall, in edk2 the initial SMM entry code
> >>>> -- before SMBASE relocation -- accommodates all VCPUs entering the same
> >>>> code area for execution. The VCPUs are told apart from each other by
> >>>> Initial APIC ID (that is, "who am I"), which is used as an index or
> >>>> search criterion into a global shared array. Once they find their
> >>>> respective private data blocks, the VCPUs don't step on each other's 
> >>>> toes.    
> >>> That's what I'm not sure.
> >>> If broadcast SMI is sent before relocation all CPUs will use
> >>> the same SMBASE and as result save their state in the same
> >>> memory (even if it's state after RESET/POWER ON) racing/overwriting
> >>> each other's saved state    
> >>
> >> Good point!
> >>  
> >>> and then on exit from SMI they all will restore
> >>> whatever state that race produced so it seems plain wrong thing to do.
> >>>
> >>> Are you sure that edk2 does broadcast for SMI initialization or does it
> >>> using directed SMIs?    
> >>
> >> Hmmm, you are right. The SmmRelocateBases() function in
> >> "UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c" first relocates SMBASE for
> >> each individual AP in succession, by sending SMI IPIs to them, one by
> >> one. Then it does the same for the BSP, by sending itself a similar SMI 
> >> IPI.
> >>
> >> The above QEMU code is only exercised much later, after the SMBASE
> >> relocation, with the SMM stack up and running. It is used when a
> >> (preferably broadcast) SMI is triggered for firmware services (for
> >> example, the UEFI variable services).
> >>
> >> So, you are right.
> >>
> >> Considering edk2 only, the difference shouldn't matter -- when this code
> >> is invoked (via an IO port write), the SMBASEs will all have been 
> >> relocated.
> >>
> >> Also, I've been informed that on real hardware, writing to APM_CNT
> >> triggers an SMI on all CPUs, regardless of the individual SMBASE values.
> >> So I believe the above code should be closest to real hardware, and the
> >> pre-patch code was only written in unicast form for SeaBIOS's sake.
> >>
> >> I think the code is safe above. If the guest injects an SMI via APM_CNT
> >> after negotiating SMI broadcast, it should have not left any VCPUs
> >> without an SMI handler by that time. edk2 / OVMF conforms to this. In
> >> the future we can tweak this further, if necessary; we have 63 more
> >> bits, and we'll be able to reject invalid combinations of bits.
> >>
> >> Do you feel that we should skip VCPUs whose SMBASE has not been changed
> >> from the default?
> >>
> >> If so, doesn't that run the risk of missing a VCPU that has an actual
> >> SMI handler installed, and it just happens to be placed at the default
> >> SMBASE location?
> >>
> >> ... I wouldn't mind skipping VCPUs with apparently unrelocated SMBASEs,
> >> but I think (a) that's not what real hardware does, and (b) it is risky
> >> if a VCPU's actual SMI handler has been installed while keeping the
> >> default SMBASE value.
> >>
> >> What do you think?  
> > (a) probably doesn't consider hotplug, so it's totally fine for the case
> > where firmware is the only one who wakes up/initializes CPUs.
> > however consider 2 CPU hotplugged and then broadcast SMI happens
> > to serve some other task (if there isn't unpark 'feature').
> > Even if real hw does it, it's behavior not defined by SDM with possibly
> > bad consequences.
> > 
> > (b) alone it's risky so I'd skip based on combination of
> >  
> >  if (default SMBASE & CPU is in reset state)
> >     skip;
> > 
> > that should be safe and it leaves open possibility to avoid adding
> > unpark 'feature' to CPU.  
> 
> The above attributes ("SMBASE" and "CPU is in reset state") are specific
> to x86 VCPUs. Should I use some dynamic QOM casting for this? Like the
> X86_CPU() macro?
> 
> Can I assume here that the macro will never return NULL? (I think so,
> LPC is only used with x86.)
yep, that should work.

> 
> ... I guess SMBASE can be accessed as
> 
>   X86_CPU(cs)->env.smbase
preferred style:
    X86CPU *cpu = X86_CPU(cs);
    cpu->...

> 
> How do I figure out if the CPU is in reset state ("waiting for first
> INIT") though? Note that this state should be distinguished from simply
> "halted" (i.e., after a HLT). After a HLT, the SMI should be injected
> and delivered; see commit a9bad65d2c1f. Edk2 regularly uses HLT to send
> APs to sleep.

how about using EIP reset value?
x86_cpu_reset()
   ...
   env->eip = 0xfff0;

> 
> Thanks
> Laszlo
> 
> >   
> >>
> >> Thanks!
> >> Laszlo
> >>  
> >>>     
> >>>>
> >>>> Thanks
> >>>> Laszlo
> >>>>    
> >>>>>       
> >>>>>> +        } else {
> >>>>>> +            cpu_interrupt(current_cpu, CPU_INTERRUPT_SMI);
> >>>>>> +        }
> >>>>>>      }
> >>>>>>  }
> >>>>>>        
> >>>>>       
> >>>>    
> >>>     
> >>  
> >   
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]