Re: [PATCH v1 2/7] accel/tcg: suppress IRQ check for special TBs
From: Alex Bennée
Subject: Re: [PATCH v1 2/7] accel/tcg: suppress IRQ check for special TBs
Date: Wed, 24 Nov 2021 10:24:53 +0000
User-agent: mu4e 1.7.5; emacs 28.0.60
Richard Henderson <richard.henderson@linaro.org> writes:
> On 11/23/21 9:57 PM, Alex Bennée wrote:
>> Generally when we set cpu->cflags_next_tb it is because we want to
>> carefully control the execution of the next TB. Currently there is a
>> race that causes cflags_next_tb to get ignored if an IRQ is processed
>> before we execute any actual instructions.
>> To avoid this we introduce a new compilation flag, CF_NOIRQ, to suppress
>> this check in the generated code so we know we will definitely execute
>> the next block.
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> Cc: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru>
>> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/245
>> ---
>> include/exec/exec-all.h | 1 +
>> include/exec/gen-icount.h | 21 +++++++++++++++++----
>> accel/tcg/cpu-exec.c | 14 ++++++++++++++
>> 3 files changed, 32 insertions(+), 4 deletions(-)
>> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
>> index 6bb2a0f7ec..35d8e93976 100644
>> --- a/include/exec/exec-all.h
>> +++ b/include/exec/exec-all.h
>> @@ -503,6 +503,7 @@ struct TranslationBlock {
>> #define CF_USE_ICOUNT 0x00020000
>> #define CF_INVALID 0x00040000 /* TB is stale. Set with @jmp_lock held */
>> #define CF_PARALLEL 0x00080000 /* Generate code for a parallel context */
>> +#define CF_NOIRQ 0x00100000 /* Generate an uninterruptible TB */
>> #define CF_CLUSTER_MASK 0xff000000 /* Top 8 bits are cluster ID */
>> #define CF_CLUSTER_SHIFT 24
>> diff --git a/include/exec/gen-icount.h b/include/exec/gen-icount.h
>> index 610cba58fe..c57204ddad 100644
>> --- a/include/exec/gen-icount.h
>> +++ b/include/exec/gen-icount.h
>> @@ -21,7 +21,6 @@ static inline void gen_tb_start(const TranslationBlock *tb)
>> {
>> TCGv_i32 count;
>> - tcg_ctx->exitreq_label = gen_new_label();
>> if (tb_cflags(tb) & CF_USE_ICOUNT) {
>> count = tcg_temp_local_new_i32();
>> } else {
>> @@ -42,7 +41,19 @@ static inline void gen_tb_start(const TranslationBlock *tb)
>> icount_start_insn = tcg_last_op();
>> }
>> - tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, tcg_ctx->exitreq_label);
>> + /*
>> + * Emit the check against icount_decr.u32 to see if we should exit
>> + * unless we suppress the check with CF_NOIRQ. If we are using
>> + * icount and have suppressed interruption the higher level code
>> + * should have ensured we don't run more instructions than the
>> + * budget.
>> + */
>> + if (tb_cflags(tb) & CF_NOIRQ) {
>> + tcg_ctx->exitreq_label = NULL;
>> + } else {
>> + tcg_ctx->exitreq_label = gen_new_label();
>> + tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, tcg_ctx->exitreq_label);
>> + }
>> if (tb_cflags(tb) & CF_USE_ICOUNT) {
>> tcg_gen_st16_i32(count, cpu_env,
>> @@ -74,8 +85,10 @@ static inline void gen_tb_end(const TranslationBlock *tb, int num_insns)
>> tcgv_i32_arg(tcg_constant_i32(num_insns)));
>> }
>> - gen_set_label(tcg_ctx->exitreq_label);
>> - tcg_gen_exit_tb(tb, TB_EXIT_REQUESTED);
>> + if (tcg_ctx->exitreq_label) {
>> + gen_set_label(tcg_ctx->exitreq_label);
>> + tcg_gen_exit_tb(tb, TB_EXIT_REQUESTED);
>> + }
>> }
>> #endif
>
> Split patch here, I think.
Not including the header to cpu_handle_interrupt?
>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
>> index 9cb892e326..9e3ed42ceb 100644
>> --- a/accel/tcg/cpu-exec.c
>> +++ b/accel/tcg/cpu-exec.c
>> @@ -721,6 +721,15 @@ static inline bool need_replay_interrupt(int interrupt_request)
>> static inline bool cpu_handle_interrupt(CPUState *cpu,
>> TranslationBlock **last_tb)
>> {
>> + /*
>> + * If we have special cflags, let's not get distracted with IRQs. We
>> + * shall exit the loop as soon as the next TB completes what it
>> + * needs to do.
>> + */
>
> We will *probably* exit the loop, I think.
>
> With watchpoints, we expect to perform the same memory operation,
> which is expected to hit the watchpoint_hit check in
> cpu_check_watchpoint, which will raise CPU_INTERRUPT_DEBUG.
>
> With SMC, we flush all TBs from the current page, re-execute one insn,
> and then (probably) have to exit to generate the next tb.
>
> With icount, in cpu_loop_exec_tb, we have no idea what's coming; we
> only know that we want no more than N insns to execute.
I think technically we do, because all asynchronous interrupts are tied
to the icount (which is part of the budget calculation - icount_get_limit).
In theory we could drop the interrupt check altogether in icount mode
because we should always end and exit to the outer loop when a timer is
going to expire.
> None of which directly exit the loop -- we need the IRQ check at the
> beginning of the *next* TB to exit the loop.
>
> If we want to force an exit from the loop, we need CF_NO_GOTO_TB |
> CF_NO_GOTO_PTR. Which is probably a good idea, at least for
> watchpoints; exit_tb is the fastest way out of the TB to recognize the
> debug interrupt.
>
> The icount usage I find a bit odd. I don't think that we should
> suppress an IRQ in that case -- we can have up to 510 insns
> outstanding on icount_budget, which is clearly far too many to have
> IRQs disabled. I think we should not use cflags_next_tb for this at
> all, but should apply the icount limit later somehow, because an IRQ
> *could* be recognized immediately, with the first TB of the interrupt
> handler running with limited budget, and the icount tick being
> recognized at that point.
I wonder what would happen if we checked u16.high in icount mode? No
timer should ever set it - although I guess it could get set during an
IO operation.
Perhaps really all icount exit checks should be done at the end of
blocks? I suspect that breaks too many assumptions in the rest of the
code.
>
>> + /*
>> + * As we don't want this special TB being interrupted by
>> + * some sort of asynchronous event we apply CF_NOIRQ to
>> + * disable the usual event checking.
>> + */
>> cflags = cpu->cflags_next_tb;
>> if (cflags == -1) {
>> cflags = curr_cflags(cpu);
>> } else {
>> + cflags |= CF_NOIRQ;
>> cpu->cflags_next_tb = -1;
>> }
>
> Is it clearer to add NOIRQ here, or back where we set cflags_next_tb
> in the first place?
Are there cases of setting cpu->cflags_next_tb where we are happy to be
preempted by asynchronous events? I guess in the SMC case it wouldn't
matter because things get reset when we come back from the IRQ?
>
>
> r~
--
Alex Bennée