qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 12/16] translate-all: discard TB when tb_link_pa


From: Alex Bennée
Subject: Re: [Qemu-devel] [PATCH 12/16] translate-all: discard TB when tb_link_page returns an existing matching TB
Date: Thu, 29 Mar 2018 16:19:39 +0100
User-agent: mu4e 1.1.0; emacs 26.0.91

Emilio G. Cota <address@hidden> writes:

> Use the recently-gained QHT feature of returning the matching TB if it
> already exists. This allows us to get rid of the lookup we perform
> right after acquiring tb_lock.
>
> Suggested-by: Richard Henderson <address@hidden>
> Signed-off-by: Emilio G. Cota <address@hidden>
> ---
>  accel/tcg/cpu-exec.c      | 14 ++------------
>  accel/tcg/translate-all.c | 47 
> ++++++++++++++++++++++++++++++++++++++---------
>  2 files changed, 40 insertions(+), 21 deletions(-)
>
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index 7c83887..8aed38c 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -243,10 +243,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
>          if (tb == NULL) {
>              mmap_lock();
>              tb_lock();
> -            tb = tb_htable_lookup(cpu, pc, cs_base, flags, cf_mask);
> -            if (likely(tb == NULL)) {
> -                tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
> -            }
> +            tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);

tb_gen_code needs to be renamed to reflect it's semantics.
tb_get_or_gen_code? Or maybe tb_get_code with a sub-helper to do the
generation.

>              tb_unlock();
>              mmap_unlock();
>          }
> @@ -396,14 +393,7 @@ static inline TranslationBlock *tb_find(CPUState *cpu,
>          tb_lock();
>          acquired_tb_lock = true;
>
> -        /* There's a chance that our desired tb has been translated while
> -         * taking the locks so we check again inside the lock.
> -         */
> -        tb = tb_htable_lookup(cpu, pc, cs_base, flags, cf_mask);
> -        if (likely(tb == NULL)) {
> -            /* if no translated code available, then translate it now */
> -            tb = tb_gen_code(cpu, pc, cs_base, flags, cf_mask);
> -        }
> +        tb = tb_gen_code(cpu, pc, cs_base, flags, cf_mask);
>
>          mmap_unlock();
>          /* We add the TB in the virtual pc hash table for the fast lookup */
> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> index 82832ef..dbe6c12 100644
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -1503,12 +1503,16 @@ static inline void tb_page_add(PageDesc *p, 
> TranslationBlock *tb,
>   * (-1) to indicate that only one page contains the TB.
>   *
>   * Called with mmap_lock held for user-mode emulation.
> + *
> + * Returns @tb or an existing TB that matches @tb.

That's just confusing to read. So this returns a TB like the @tb we
passed in but actually a different one matching the same conditions?

>   */
> -static void tb_link_page(TranslationBlock *tb, tb_page_addr_t phys_pc,
> -                         tb_page_addr_t phys_page2)
> +static TranslationBlock *
> +tb_link_page(TranslationBlock *tb, tb_page_addr_t phys_pc,
> +             tb_page_addr_t phys_page2)
>  {
>      PageDesc *p;
>      PageDesc *p2 = NULL;
> +    void *existing_tb;
>      uint32_t h;
>
>      assert_memory_lock();
> @@ -1516,6 +1520,11 @@ static void tb_link_page(TranslationBlock *tb, 
> tb_page_addr_t phys_pc,
>      /*
>       * Add the TB to the page list.
>       * To avoid deadlock, acquire first the lock of the lower-addressed page.
> +     * We keep the locks held until after inserting the TB in the hash table,
> +     * so that if the insertion fails we know for sure that the TBs are still
> +     * in the page descriptors.
> +     * Note that inserting into the hash table first isn't an option, since
> +     * we can only insert TBs that are fully initialized.
>       */
>      p = page_find_alloc(phys_pc >> TARGET_PAGE_BITS, 1);
>      if (likely(phys_page2 == -1)) {
> @@ -1535,21 +1544,33 @@ static void tb_link_page(TranslationBlock *tb, 
> tb_page_addr_t phys_pc,
>          tb_page_add(p2, tb, 1, phys_page2);
>      }
>
> +    /* add in the hash table */
> +    h = tb_hash_func(phys_pc, tb->pc, tb->flags, tb->cflags & CF_HASH_MASK,
> +                     tb->trace_vcpu_dstate);
> +    existing_tb = qht_insert(&tb_ctx.htable, tb, h);

modulo comments about qht_insert API earlier in the series.

> +
> +    /* remove TB from the page(s) if we couldn't insert it */
> +    if (unlikely(existing_tb)) {
> +        tb_page_remove(p, tb);
> +        invalidate_page_bitmap(p);
> +        if (p2) {
> +            tb_page_remove(p2, tb);
> +            invalidate_page_bitmap(p2);
> +        }
> +        tb = existing_tb;
> +    }
> +
>      if (p2) {
>          page_unlock(p2);
>      }
>      page_unlock(p);
>
> -    /* add in the hash table */
> -    h = tb_hash_func(phys_pc, tb->pc, tb->flags, tb->cflags & CF_HASH_MASK,
> -                     tb->trace_vcpu_dstate);
> -    qht_insert(&tb_ctx.htable, tb, h);
> -
>  #ifdef CONFIG_USER_ONLY
>      if (DEBUG_TB_CHECK_GATE) {
>          tb_page_check();
>      }
>  #endif
> +    return tb;
>  }
>
>  /* Called with mmap_lock held for user mode emulation.  */
> @@ -1558,7 +1579,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>                                uint32_t flags, int cflags)
>  {
>      CPUArchState *env = cpu->env_ptr;
> -    TranslationBlock *tb;
> +    TranslationBlock *tb, *existing_tb;
>      tb_page_addr_t phys_pc, phys_page2;
>      target_ulong virt_page2;
>      tcg_insn_unit *gen_code_buf;
> @@ -1706,7 +1727,15 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>       * memory barrier is required before tb_link_page() makes the TB visible
>       * through the physical hash table and physical page list.
>       */
> -    tb_link_page(tb, phys_pc, phys_page2);
> +    existing_tb = tb_link_page(tb, phys_pc, phys_page2);
> +    /* if the TB already exists, discard what we just translated */

So are we in the position now that we could potentially do a translation
but be beaten by another thread generating the same code? I suspect we could
do with a bit of explanatory commentary for the tb_gen_code functions.

Also I think the "Translation Blocks" section needs updating in the
MTTCG design document to make this clear.

I'm curious if we should be counting unused translations somewhere in
the JIT stats. I'm guessing you need to work at a pathalogical case to
hit this much?

> +    if (unlikely(existing_tb != tb)) {
> +        uintptr_t orig_aligned = (uintptr_t)gen_code_buf;
> +
> +        orig_aligned -= ROUND_UP(sizeof(*tb), qemu_icache_linesize);
> +        atomic_set(&tcg_ctx->code_gen_ptr, orig_aligned);
> +        return existing_tb;
> +    }
>      tcg_tb_insert(tb);
>      return tb;
>  }


--
Alex Bennée



reply via email to

[Prev in Thread] Current Thread [Next in Thread]