[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 15/24] accel/tcg: Use interval tree for TBs in user-only mode
From: |
Alex Bennée |
Subject: |
Re: [PATCH 15/24] accel/tcg: Use interval tree for TBs in user-only mode |
Date: |
Tue, 25 Oct 2022 16:58:19 +0100 |
User-agent: |
mu4e 1.9.1; emacs 28.2.50 |
Richard Henderson <richard.henderson@linaro.org> writes:
> Begin weaning user-only away from PageDesc.
>
> Since, for user-only, all TB (and page) manipulation is done with
> a single mutex, and there is no virtual/physical discontinuity to
> split a TB across discontinuous pages, place all of the TBs into
> a single IntervalTree. This makes it trivial to find all of the
> TBs intersecting a range.
>
> Retain the existing PageDesc + linked list implementation for
> system mode. Move the portion of the implementation that overlaps
> the new user-only code behind the common ifdef.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/internal.h | 16 +-
> include/exec/exec-all.h | 43 ++++-
> accel/tcg/tb-maint.c | 388 ++++++++++++++++++++++----------------
> accel/tcg/translate-all.c | 4 +-
> 4 files changed, 280 insertions(+), 171 deletions(-)
>
<snip>
> diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
> index c8e921089d..14e8e47a6a 100644
> --- a/accel/tcg/tb-maint.c
> +++ b/accel/tcg/tb-maint.c
> @@ -18,6 +18,7 @@
> */
>
> #include "qemu/osdep.h"
> +#include "qemu/interval-tree.h"
> #include "exec/cputlb.h"
> #include "exec/log.h"
> #include "exec/exec-all.h"
> @@ -50,6 +51,75 @@ void tb_htable_init(void)
> qht_init(&tb_ctx.htable, tb_cmp, CODE_GEN_HTABLE_SIZE, mode);
> }
I wonder for the sake of avoiding recompilation of units later on and
having a clean separation between user and system mode it would be worth
putting this stuff in a tb-maint-user.c?
>
> +#ifdef CONFIG_USER_ONLY
> +/*
> + * For user-only, since we are protecting all of memory with a single lock,
> + * and because the two pages of a TranslationBlock are always contiguous,
> + * use a single data structure to record all TranslationBlocks.
> + */
<snip>
> +
> +/*
> + * Called with mmap_lock held. If pc is not 0 then it indicates the
> + * host PC of the faulting store instruction that caused this invalidate.
> + * Returns true if the caller needs to abort execution of the current
> + * TB (because it was modified by this store and the guest CPU has
> + * precise-SMC semantics).
> + */
> +bool tb_invalidate_phys_page_unwind(tb_page_addr_t addr, uintptr_t pc)
> +{
> + assert(pc != 0);
> +#ifdef TARGET_HAS_PRECISE_SMC
> + assert_memory_lock();
Out of interest is this just because x86 has such a strong memory model
you can get away with this sort of patching without explicit flushes?
I'm curious why this is the only arch we jump through these hoops for?
> + {
> + TranslationBlock *current_tb = tcg_tb_lookup(pc);
> + bool current_tb_modified = false;
> + TranslationBlock *tb;
> + PageForEachNext n;
> +
> + addr &= TARGET_PAGE_MASK;
> +
> + PAGE_FOR_EACH_TB(addr, addr + TARGET_PAGE_SIZE, unused, tb, n) {
> + if (current_tb == tb &&
> + (tb_cflags(current_tb) & CF_COUNT_MASK) != 1) {
> + /*
> + * If we are modifying the current TB, we must stop its
> + * execution. We could be more precise by checking that
> + * the modification is after the current PC, but it would
> + * require a specialized function to partially restore
> + * the CPU state.
> + */
> + current_tb_modified = true;
> + cpu_restore_state_from_tb(current_cpu, current_tb, pc, true);
> + }
> + tb_phys_invalidate__locked(tb);
> + }
> +
> + if (current_tb_modified) {
> + /* Force execution of one insn next time. */
> + CPUState *cpu = current_cpu;
> + cpu->cflags_next_tb = 1 | CF_NOIRQ | curr_cflags(current_cpu);
> + return true;
> + }
> + }
> +#else
> + tb_invalidate_phys_page(addr);
> +#endif /* TARGET_HAS_PRECISE_SMC */
> + return false;
> +}
> +#else
> /*
> * @p must be non-NULL.
> * user-mode: call with mmap_lock held.
> @@ -492,22 +637,17 @@ tb_invalidate_phys_page_range__locked(struct
> page_collection *pages,
> {
> TranslationBlock *tb;
> tb_page_addr_t tb_start, tb_end;
> - int n;
> + PageForEachNext n;
> #ifdef TARGET_HAS_PRECISE_SMC
> - CPUState *cpu = current_cpu;
> - bool current_tb_not_found = retaddr != 0;
> bool current_tb_modified = false;
> - TranslationBlock *current_tb = NULL;
> + TranslationBlock *current_tb = retaddr ? tcg_tb_lookup(retaddr) : NULL;
> #endif /* TARGET_HAS_PRECISE_SMC */
>
> - assert_page_locked(p);
> -
> /*
> * We remove all the TBs in the range [start, end[.
> * XXX: see if in some cases it could be faster to invalidate all the
> code
> */
I'm guessing this comment is quite stale now given we try quite hard to
avoid doing lots of code gen over and over again. The only case I can
think of is memory clear routines after we've had code which there might
be some heuristics we could use to detect but don't currently.
<snip>
Otherwise:
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
--
Alex Bennée
- Re: [PATCH 12/24] accel/tcg: Unify declarations of tb_invalidate_phys_range, (continued)
- [PATCH 14/24] accel/tcg: Call tb_invalidate_phys_page for PAGE_RESET, Richard Henderson, 2022/10/05
- [PATCH 13/24] accel/tcg: Use tb_invalidate_phys_page in page_set_flags, Richard Henderson, 2022/10/05
- [PATCH 17/24] accel/tcg: Use tb_invalidate_phys_range in page_set_flags, Richard Henderson, 2022/10/05
- [PATCH 16/24] accel/tcg: Use page_reset_target_data in page_set_flags, Richard Henderson, 2022/10/05
- [PATCH 15/24] accel/tcg: Use interval tree for TBs in user-only mode, Richard Henderson, 2022/10/05
- Re: [PATCH 15/24] accel/tcg: Use interval tree for TBs in user-only mode,
Alex Bennée <=
- [PATCH 18/24] accel/tcg: Move TARGET_PAGE_DATA_SIZE impl to user-exec.c, Richard Henderson, 2022/10/05
- [PATCH 19/24] accel/tcg: Simplify page_get/alloc_target_data, Richard Henderson, 2022/10/05
- [PATCH 20/24] accel/tcg: Use interval tree for TARGET_PAGE_DATA_SIZE, Richard Henderson, 2022/10/05
- [PATCH 21/24] accel/tcg: Move page_{get,set}_flags to user-exec.c, Richard Henderson, 2022/10/05