Re: [RFC PATCH] hw/arm/virt: Support NMI injection

From: Gavin Shan
Subject: Re: [RFC PATCH] hw/arm/virt: Support NMI injection
Date: Wed, 29 Jan 2020 14:41:10 +1100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.0

On 1/29/20 1:44 PM, Alexey Kardashevskiy wrote:

On 28/01/2020 17:48, Gavin Shan wrote:
[including more folks into the discussion]

On Fri, 17 Jan 2020 at 14:00, Peter Maydell <address@hidden>
On Thu, 19 Dec 2019 at 04:06, Gavin Shan <address@hidden> wrote:
This supports NMI injection for virtual machine and currently it's only
supported on GICv3 controller, which is emulated by qemu or host
The design is highlighted as below:

* The NMI is identified by its priority (0x20). In the guest (linux)
kernel, the GICC_PMR is set to 0x80 to block all interrupts except
the NMIs when external interrupts are disabled. This means the FIQ
and IRQ bits in PSTATE aren't touched when the functionality (NMI) is
* LPIs aren't considered as NMIs because of their nature. It means NMI
is either SPI or PPI. Besides, the NMIs are injected in round-robin
fashion if there are multiple NMIs.
* When the GICv3 controller is emulated by qemu, the interrupt states
(e.g. enabled, priority) are fetched from the corresponding data struct
directly. However, we have to pause all CPUs to fetch the interrupt
states from host in advance if the GICv3 controller is emulated by

The testing scenario is to tweak the guest (linux) kernel so that the
pl011 SPI can be enabled as an NMI by request_nmi(). Check
"/proc/interrupts" after injecting several NMIs to see whether the
interrupt count has increased. The result is just as expected.

So, QEMU is trying to emulate actual hardware. None of this
looks to me like what GICv3 hardware does... If you want to
have the virt board send an interrupt, do it the usual way
by wiring up a qemu_irq from some device to the GIC, please.
(More generally, there is no concept of an "NMI" in the GIC;
there are just interrupts at varying possible guest-programmable
priority levels.)

Peter, I missed your reply at the time; apologies for the late response.

Yes, there is no concept of "NMI" in the GIC from a hardware perspective.
However, NMIs have been supported in software since kernel commit
bc3c03ccb4641 ("arm64: Enable the support of pseudo-NMIs"). The NMIs
have higher priority than normal interrupts. NMIs are deliverable after
local_irq_disable() because SYS_ICC_PMR_EL1 is tweaked so that
only normal interrupts are masked.
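
To make the masking rule concrete, here is a minimal Python model of the GIC priority mask. It is only a sketch, not QEMU or kernel code. The values 0x20 (NMI priority) and 0x80 (PMR while interrupts are "disabled") are the ones quoted in the patch description; NORMAL_IRQ_PRIORITY and PMR_UNMASKED are assumed example values, and the function names are made up for illustration.

```python
from typing import Dict, List

# Values taken from the patch description in this thread.
NMI_PRIORITY = 0x20
PMR_IRQS_MASKED = 0x80

# Assumptions for illustration only: a typical priority for an ordinary
# interrupt and a fully open priority mask.
NORMAL_IRQ_PRIORITY = 0xA0
PMR_UNMASKED = 0xFF


def is_signaled(priority: int, pmr: int) -> bool:
    """Lower value means higher priority; an interrupt is signaled to
    the CPU only if its priority is strictly higher than the mask."""
    return priority < pmr


def deliverable(pending: Dict[str, int], pmr: int) -> List[str]:
    """Return the pending interrupts that pass the mask, highest priority first."""
    passed = [name for name, prio in pending.items() if is_signaled(prio, pmr)]
    return sorted(passed, key=lambda name: pending[name])


if __name__ == "__main__":
    pending = {"timer": NORMAL_IRQ_PRIORITY, "pl011-nmi": NMI_PRIORITY}
    print(deliverable(pending, PMR_IRQS_MASKED))
    print(deliverable(pending, PMR_UNMASKED))
```

With the mask at 0x80 only the 0x20 interrupt passes, which is why the pseudo-NMI stays deliverable without touching the PSTATE interrupt bits.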

The purpose of the "nmi" QMP/HMP command is unclear, which is why I
added the RFC tag. The command is supported by multiple architectures,
including x86 and ppc, but it behaves differently on each. On ppc the
system is restarted by this command,

We inject "system reset" as it is the closest thing to the idea of NMI
(could be a "machine check").

The system behaviour is configurable on POWERPC, it is either kdump
(store a system dump and reboot) or simple reboot or activate XMON
(in-kernel debugger, needs to be enabled beforehand).

The injector in QEMU is called NMIClass::nmi_monitor_handler and, as the
name suggests, it is not an NMI (the hardware concept which x86 may
still have and others do not) but the "nmi" command of the QEMU monitor,
which is rather a debug tool - "kick an unresponsive guest" - for us

Alexey, thanks for the explanation. The behavior for PowerPC is clear now :)

but an NMI is injected
through the LAPIC on x86. So I'm not sure which behavior (system reset on
ppc or NMI injection on x86) aarch64 should follow.

I'd say whatever triggers the in-kernel debugger or kdump, but I am not
familiar with ARM at all :)
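
For reference, the monitor command under discussion can also be driven over QMP, where it is named "inject-nmi" (the HMP spelling is "nmi"). The Python sketch below is an illustration under assumptions, not part of the patch: it assumes QEMU was started with a QMP UNIX socket such as "-qmp unix:/tmp/qmp.sock,server,nowait", and the socket path and helper names are hypothetical.

```python
import json
import socket


def qmp_command(name, arguments=None):
    """Encode one QMP command as a single JSON line."""
    msg = {"execute": name}
    if arguments is not None:
        msg["arguments"] = arguments
    return json.dumps(msg).encode("utf-8") + b"\n"


def read_reply(stream):
    """Skip asynchronous events and return the first command reply."""
    while True:
        line = stream.readline()
        if not line:
            raise ConnectionError("QMP connection closed")
        msg = json.loads(line)
        if "return" in msg or "error" in msg:
            return msg


def inject_nmi(sock_path):
    """Negotiate capabilities, then ask QEMU to inject an NMI.

    sock_path is a hypothetical path to the QMP UNIX socket.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        stream = sock.makefile("rwb")
        stream.readline()  # discard the {"QMP": ...} greeting
        for name in ("qmp_capabilities", "inject-nmi"):
            stream.write(qmp_command(name))
            stream.flush()
            reply = read_reply(stream)
            if "error" in reply:
                raise RuntimeError(reply["error"])
        return reply
```

What the guest does in response is decided by the per-machine handler, which is exactly the open question for aarch64 in this thread.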

For x86, the behavior really depends on the NMI handler. Currently, it
seems to do nothing other than print the messages below. However, it can be
configured to crash the system via "/proc/sys/kernel/unknown_nmi_panic"

(qemu) nmi
[ 6731.137504] Uhhuh. NMI received for unknown reason 30 on CPU 0.
[ 6731.137511] Do you have a strange power saving mode enabled?
[ 6731.137512] Dazed and confused, but trying to continue

guest# cat /proc/sys/kernel/unknown_nmi_panic
guest# echo 1 > /proc/sys/kernel/unknown_nmi_panic
(qemu) nmi
[ 6852.848600] Do you have a strange power saving mode enabled?
[ 6852.848601] Kernel panic - not syncing: NMI: Not continuing
[ 6852.848602] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-rc6-gshan+ #21
[ 6852.848604] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
[ 6852.848604] Call Trace:
[ 6852.848605]  <NMI>
[ 6852.848606]  dump_stack+0x6d/0x9a
[ 6852.848607]  panic+0x101/0x2e3
[ 6852.848608]  nmi_panic.cold+0xc/0xc
[ 6852.848609]  unknown_nmi_error.cold+0x46/0x57
[ 6852.848609]  default_do_nmi+0xda/0x110
[ 6852.848610]  do_nmi+0x16e/0x1d0
[ 6852.848611]  end_repeat_nmi+0x16/0x1a
[ 6852.848625] RIP: 0010:native_safe_halt+0xe/0x10
[ 6852.848628] Code: 7b ff ff ff eb bd 90 90 90 90 90 90 e9 07 00 00 00 0f 00 2d 56 bc5
[ 6852.848639] RSP: 0018:ffffffffba603e10 EFLAGS: 00000246
[ 6852.848642] RAX: ffffffffb9ccbdb0 RBX: 0000000000000000 RCX: 0000000000000001
[ 6852.848643] RDX: 00000000000202ce RSI: 0000000000000083 RDI: 0000000000000000
[ 6852.848644] RBP: ffffffffba603e30 R08: 0000063b8ee46b61 R09: 0000000000000201
[ 6852.848645] R10: ffff9e29be53866c R11: 0000000000000018 R12: 0000000000000000
[ 6852.848646] R13: ffffffffba611780 R14: 0000000000000000 R15: 0000000000000000
[ 6852.848647]  ? __sched_text_end+0x1/0x1
[ 6852.848648]  ? native_safe_halt+0xe/0x10
[ 6852.848649]  ? native_safe_halt+0xe/0x10
[ 6852.848650]  </NMI>
[ 6852.848650]  ? default_idle+0x20/0x140
[ 6852.848651]  arch_cpu_idle+0x15/0x20
[ 6852.848652]  default_idle_call+0x23/0x30
[ 6852.848653]  do_idle+0x1fb/0x270
[ 6852.848654]  cpu_startup_entry+0x20/0x30
[ 6852.848655]  rest_init+0xae/0xb0
[ 6852.848656]  arch_call_rest_init+0xe/0x1b
[ 6852.848657]  start_kernel+0x4dd/0x4fd
[ 6852.848658]  x86_64_start_reservations+0x24/0x26
[ 6852.848658]  x86_64_start_kernel+0x75/0x79
[ 6852.848659]  secondary_startup_64+0xa4/0xb0
[ 6852.849153] Kernel Offset: 0x38400000 from 0xffffffff81000000 (relocation range: 0x)
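
The sysctl toggle used in the transcript above can also be scripted inside the guest. The following Python sketch is only an illustration: the helper names are made up, the path is the one shown in the transcript, and writing to it requires root in the guest.

```python
from pathlib import Path

# Path taken from the transcript above.
UNKNOWN_NMI_PANIC = Path("/proc/sys/kernel/unknown_nmi_panic")


def panic_on_unknown_nmi(path=UNKNOWN_NMI_PANIC):
    """Return True if the kernel is set to panic on an unknown NMI."""
    return Path(path).read_text().strip() != "0"


def set_panic_on_unknown_nmi(enabled, path=UNKNOWN_NMI_PANIC):
    """Equivalent of `echo 1 > .../unknown_nmi_panic` (or `echo 0`)."""
    Path(path).write_text("1\n" if enabled else "0\n")


if __name__ == "__main__":
    # Needs root inside the guest; afterwards "nmi" in the monitor panics it.
    set_panic_on_unknown_nmi(True)
    print(panic_on_unknown_nmi())
```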

