qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH RFC 1/1] accel/tcg: Clear PAGE_WRITE before translation


From: Richard Henderson
Subject: Re: [PATCH RFC 1/1] accel/tcg: Clear PAGE_WRITE before translation
Date: Wed, 4 Aug 2021 14:30:50 -1000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0

On 8/4/21 12:46 PM, Ilya Leoshkevich wrote:
translate_insn() implementations fetch instruction bytes piecemeal,
which can cause qemu-user to generate inconsistent translations if
another thread modifies them concurrently [1].

Fix by marking translation block pages non-writable earlier.

[1] https://lists.nongnu.org/archive/html/qemu-devel/2021-08/msg00644.html

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
  accel/tcg/translate-all.c    | 59 +++++++++++++++++++++---------------
  accel/tcg/translator.c       | 26 ++++++++++++++--
  include/exec/translate-all.h |  1 +
  3 files changed, 59 insertions(+), 27 deletions(-)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index bbfcfb698c..fb9ebfad9e 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1297,31 +1297,8 @@ static inline void tb_page_add(PageDesc *p, 
TranslationBlock *tb,
      invalidate_page_bitmap(p);
#if defined(CONFIG_USER_ONLY)
-    if (p->flags & PAGE_WRITE) {
-        target_ulong addr;
-        PageDesc *p2;
-        int prot;
-
-        /* force the host page as non writable (writes will have a
-           page fault + mprotect overhead) */
-        page_addr &= qemu_host_page_mask;
-        prot = 0;
-        for (addr = page_addr; addr < page_addr + qemu_host_page_size;
-            addr += TARGET_PAGE_SIZE) {
-
-            p2 = page_find(addr >> TARGET_PAGE_BITS);
-            if (!p2) {
-                continue;
-            }
-            prot |= p2->flags;
-            p2->flags &= ~PAGE_WRITE;
-          }
-        mprotect(g2h_untagged(page_addr), qemu_host_page_size,
-                 (prot & PAGE_BITS) & ~PAGE_WRITE);
-        if (DEBUG_TB_INVALIDATE_GATE) {
-            printf("protecting code page: 0x" TB_PAGE_ADDR_FMT "\n", 
page_addr);
-        }
-    }
+    /* translator_loop() must have made all TB pages non-writable */
+    assert(!(p->flags & PAGE_WRITE));
  #else
      /* if some code is already present, then the pages are already
         protected. So we handle the case where only the first TB is
@@ -2394,6 +2371,38 @@ int page_check_range(target_ulong start, target_ulong 
len, int flags)
      return 0;
  }
+void page_protect(tb_page_addr_t page_addr)
+{
+    target_ulong addr;
+    PageDesc *p;
+    int prot;
+
+    p = page_find(page_addr >> TARGET_PAGE_BITS);
+    if (p && (p->flags & PAGE_WRITE)) {
+        /*
+         * Force the host page as non writable (writes will have a page fault +
+         * mprotect overhead).
+         */
+        page_addr &= qemu_host_page_mask;
+        prot = 0;
+        for (addr = page_addr; addr < page_addr + qemu_host_page_size;
+             addr += TARGET_PAGE_SIZE) {
+
+            p = page_find(addr >> TARGET_PAGE_BITS);
+            if (!p) {
+                continue;
+            }
+            prot |= p->flags;
+            p->flags &= ~PAGE_WRITE;
+        }
+        mprotect(g2h_untagged(page_addr), qemu_host_page_size,
+                 (prot & PAGE_BITS) & ~PAGE_WRITE);
+        if (DEBUG_TB_INVALIDATE_GATE) {
+            printf("protecting code page: 0x" TB_PAGE_ADDR_FMT "\n", 
page_addr);
+        }
+    }
+}
+
  /* called from signal handler: invalidate the code and unprotect the
   * page. Return 0 if the fault was not handled, 1 if it was handled,
   * and 2 if it was handled but the caller must cause the TB to be
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index c53a7f8e44..bfbe7d7ccf 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -14,6 +14,7 @@
  #include "exec/exec-all.h"
  #include "exec/gen-icount.h"
  #include "exec/log.h"
+#include "exec/translate-all.h"
  #include "exec/translator.h"
  #include "exec/plugin-gen.h"
  #include "sysemu/replay.h"
@@ -47,6 +48,10 @@ void translator_loop(const TranslatorOps *ops, 
DisasContextBase *db,
  {
      uint32_t cflags = tb_cflags(tb);
      bool plugin_enabled;
+    bool stop = false;
+#ifdef CONFIG_USER_ONLY
+    target_ulong page_addr = -1;
+#endif
/* Initialize DisasContext */
      db->tb = tb;
@@ -71,6 +76,21 @@ void translator_loop(const TranslatorOps *ops, 
DisasContextBase *db,
      plugin_enabled = plugin_gen_tb_start(cpu, tb, cflags & CF_MEMI_ONLY);
while (true) {
+#ifdef CONFIG_USER_ONLY
+        /*
+         * Make the page containing the next instruction non-writable in order
+         * to get a consistent translation if another thread is modifying the
+         * code while translate_insn() fetches the instruction bytes piecemeal.
+         * Writer threads will wait for mmap_lock() in page_unprotect().
+         */
+        if ((db->pc_next & TARGET_PAGE_MASK) != page_addr) {
+            page_addr = db->pc_next & TARGET_PAGE_MASK;
+            page_protect(page_addr);
+        }
+#endif
+        if (stop) {
+            break;
+        }

So... I think this isn't quite right.

(1) If we stop exactly at the page break, this protects the *next* page 
unnecessarily.

(2) This only protects the page of the start of the insn. If the instruction crosses the page boundary, then the latter part of the insn is still victim to the race we're trying to fix.

I think a protect needs to happen in translator_ld*_swap, before reading the 
data.

I think that the translator_ld*_swap functions should be moved out of include/exec/translator.h into accel/tcg/translator.c.

I think that the translator_ld* functions should add a DisasContextBase argument, which should then contain the cache for the protection. This will be a moderately large change, but it should be mostly mechanical.

I think that we should initialize the protection cache before translating the first insn, outside of that loop. This will mean that you need not check for two pages simultaneously, when a single read crosses the page boundary. You'll know that (at minimum) the first byte of the first read is already covered, and only need to check the last byte of each subsequent read. I think the value you use for your cache should be the end of the page for which protection is known to apply, so that the check reduces to

  end = pc + len - 1;
  if (end > dcbase->page_protect_end) {
    dcbase->page_protect_end = end | ~TARGET_PAGE_MASK;
    page_protect(end);
  }


r~



reply via email to

[Prev in Thread] Current Thread [Next in Thread]