qemu-devel

Re: [PATCH v5 6/7] tcg: implement JIT for iOS and Apple Silicon


From: Alexander Graf
Subject: Re: [PATCH v5 6/7] tcg: implement JIT for iOS and Apple Silicon
Date: Fri, 20 Nov 2020 15:15:23 +0100
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.16; rv:84.0) Gecko/20100101 Thunderbird/84.0


On 20.11.20 10:08, Alexander Graf wrote:

On 09.11.20 00:24, Joelle van Dyne wrote:
When entitlements are available (macOS or jailbroken iOS), a hardware
feature called APRR exists on newer Apple Silicon that can cheaply mark JIT
pages as either RX or RW. Reverse engineered functions from
libsystem_pthread.dylib are implemented to handle this.

The following rules apply for JIT write protect:
   * JIT write-protect is enabled before tcg_qemu_tb_exec()
   * JIT write-protect is disabled after tcg_qemu_tb_exec() returns
   * JIT write-protect is disabled inside do_tb_phys_invalidate() but if it
     is called inside of tcg_qemu_tb_exec() then write-protect will be
     enabled again before returning.
   * JIT write-protect is disabled by cpu_loop_exit() for interrupt handling.
   * JIT write-protect is disabled everywhere else.

See https://developer.apple.com/documentation/apple_silicon/porting_just-in-time_compilers_to_apple_silicon

Signed-off-by: Joelle van Dyne <j@getutm.app>
---
  include/exec/exec-all.h     |  2 +
  include/tcg/tcg-apple-jit.h | 86 +++++++++++++++++++++++++++++++++++++
  include/tcg/tcg.h           |  3 ++
  accel/tcg/cpu-exec-common.c |  2 +
  accel/tcg/cpu-exec.c        |  2 +
  accel/tcg/translate-all.c   | 46 ++++++++++++++++++++
  tcg/tcg.c                   |  4 ++
  7 files changed, 145 insertions(+)
  create mode 100644 include/tcg/tcg-apple-jit.h

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index aa65103702..3829f3d470 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -549,6 +549,8 @@ TranslationBlock *tb_htable_lookup(CPUState *cpu, target_ulong pc,
                                     target_ulong cs_base, uint32_t flags,
                                     uint32_t cf_mask);
  void tb_set_jmp_target(TranslationBlock *tb, int n, uintptr_t addr);
+void tb_exec_lock(void);
+void tb_exec_unlock(void);

  /* GETPC is the true target of the return instruction that we'll execute.  */
  #if defined(CONFIG_TCG_INTERPRETER)
diff --git a/include/tcg/tcg-apple-jit.h b/include/tcg/tcg-apple-jit.h
new file mode 100644
index 0000000000..9efdb2000d
--- /dev/null
+++ b/include/tcg/tcg-apple-jit.h
@@ -0,0 +1,86 @@
+/*
+ * Apple Silicon functions for JIT handling
+ *
+ * Copyright (c) 2020 osy
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef TCG_APPLE_JIT_H
+#define TCG_APPLE_JIT_H
+
+/*
+ * APRR handling
+ * Credits to: https://siguza.github.io/APRR/
+ * Reversed from /usr/lib/system/libsystem_pthread.dylib
+ */
+
+#if defined(__aarch64__) && defined(CONFIG_DARWIN)
+
+#define _COMM_PAGE_START_ADDRESS        (0x0000000FFFFFC000ULL) /* In TTBR0 */
+#define _COMM_PAGE_APRR_SUPPORT (_COMM_PAGE_START_ADDRESS + 0x10C)
+#define _COMM_PAGE_APPR_WRITE_ENABLE (_COMM_PAGE_START_ADDRESS + 0x110)
+#define _COMM_PAGE_APRR_WRITE_DISABLE (_COMM_PAGE_START_ADDRESS + 0x118)
+
+static __attribute__((__always_inline__)) bool jit_write_protect_supported(void)
+{
+    /* Access shared kernel page at fixed memory location. */
+    uint8_t aprr_support = *(volatile uint8_t *)_COMM_PAGE_APRR_SUPPORT;
+    return aprr_support > 0;
+}
+
+/* write protect enable = write disable */
+static __attribute__((__always_inline__)) void jit_write_protect(int enabled)
+{
+    /* Access shared kernel page at fixed memory location. */
+    uint8_t aprr_support = *(volatile uint8_t *)_COMM_PAGE_APRR_SUPPORT;
+    if (aprr_support == 0 || aprr_support > 3) {
+        return;
+    } else if (aprr_support == 1) {
+        __asm__ __volatile__ (
+            "mov x0, %0\n"
+            "ldr x0, [x0]\n"
+            "msr S3_4_c15_c2_7, x0\n"
+            "isb sy\n"
+            :: "r" (enabled ? _COMM_PAGE_APRR_WRITE_DISABLE
+                            : _COMM_PAGE_APPR_WRITE_ENABLE)
+            : "memory", "x0"
+        );
+    } else {
+        __asm__ __volatile__ (
+            "mov x0, %0\n"
+            "ldr x0, [x0]\n"
+            "msr S3_6_c15_c1_5, x0\n"
+            "isb sy\n"
+            :: "r" (enabled ? _COMM_PAGE_APRR_WRITE_DISABLE
+                            : _COMM_PAGE_APPR_WRITE_ENABLE)
+            : "memory", "x0"
+        );
+    }
+}


Is there a particular reason you're not just calling pthread_jit_write_protect_np()? That would remove the dependency on anything reverse engineered.


+
+#else /* defined(__aarch64__) && defined(CONFIG_DARWIN) */
+
+static __attribute__((__always_inline__)) bool jit_write_protect_supported(void)
+{
+    return false;
+}
+
+static __attribute__((__always_inline__)) void jit_write_protect(int enabled)
+{
+}
+
+#endif
+
+#endif /* define TCG_APPLE_JIT_H */
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 477919aeb6..b16b687d0b 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -625,6 +625,9 @@ struct TCGContext {
      size_t code_gen_buffer_size;
      void *code_gen_ptr;
      void *data_gen_ptr;
+#if defined(CONFIG_DARWIN) && !defined(CONFIG_TCG_INTERPRETER)
+    bool code_gen_locked; /* on Darwin each thread tracks W^X flags */


I don't quite understand why you need to keep track of whether you're in the locked state or not. If you always stay in the locked state and unlock only around the few places that modify the code gen region, you should be fine, no?


I take this bit back. After experimenting with setting the flags the other way around, I think what you do here is better. Especially when it comes to exception handling, treating the code region as writable by default is the better choice.




+#endif

      /* Threshold to flush the translated code buffer.  */
      void *code_gen_highwater;
diff --git a/accel/tcg/cpu-exec-common.c b/accel/tcg/cpu-exec-common.c
index 12c1e3e974..f1eb767b02 100644
--- a/accel/tcg/cpu-exec-common.c
+++ b/accel/tcg/cpu-exec-common.c
@@ -64,6 +64,8 @@ void cpu_reloading_memory_map(void)

  void cpu_loop_exit(CPUState *cpu)
  {
+    /* Unlock JIT write protect if applicable. */
+    tb_exec_unlock();


Why do you need to unlock here? I think in general this patch is trying to keep the state RW always and only flip to RX when actually executing code, right?

I think it would be much easier and cleaner to do it the other way around: keep the region RX by default and flip it to RW only when you need to modify it.

Also, shouldn't the code gen buffer be allocated with MAP_JIT according to the porting guide?


MAP_JIT is definitely needed to make this work on macOS, and it is missing here.

Also, I would prefer a better name for the lock/unlock functions. How about "tcg_set_codegen_mutable(bool)"? You can then easily map that to the pthread call.
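
A rough sketch of that mapping, under a few assumptions: pthread_jit_write_protect_np() and pthread_jit_write_protect_supported_np() come from <pthread.h> in the macOS 11 SDK, and as far as I know they are not available to iOS targets, so the sketch only uses them on macOS/arm64 and is a no-op elsewhere. The per-thread state variable, tcg_codegen_is_mutable() and patch_code_byte() are hypothetical and only there for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <pthread.h>

#if defined(__APPLE__)
#include <TargetConditionals.h>
#if TARGET_OS_OSX && defined(__aarch64__)
#define HAVE_PTHREAD_JIT_WRITE_PROTECT 1
#endif
#endif
#ifndef HAVE_PTHREAD_JIT_WRITE_PROTECT
#define HAVE_PTHREAD_JIT_WRITE_PROTECT 0
#endif

/* Per-thread record of the last state we set; -1 means "not set yet". */
static _Thread_local int codegen_mutable_state = -1;

static void tcg_set_codegen_mutable(bool is_mutable)
{
    if (codegen_mutable_state == (int)is_mutable) {
        return; /* this thread is already in the requested state */
    }
#if HAVE_PTHREAD_JIT_WRITE_PROTECT
    if (pthread_jit_write_protect_supported_np()) {
        /* Write protect on means "not mutable", hence the negation. */
        pthread_jit_write_protect_np(!is_mutable);
    }
#endif
    codegen_mutable_state = is_mutable;
}

/* Hypothetical query helper: true only after an explicit switch to RW. */
static bool tcg_codegen_is_mutable(void)
{
    return codegen_mutable_state == 1;
}

/*
 * Hypothetical caller: make the buffer writable, patch one byte, then
 * switch back to executable. Real code would also flush the icache.
 */
static void patch_code_byte(unsigned char *code, unsigned char value)
{
    tcg_set_codegen_mutable(true);
    assert(tcg_codegen_is_mutable());
    *code = value;
    tcg_set_codegen_mutable(false);
}
```

Tracking the state per thread mirrors the code_gen_locked flag in the patch and avoids redundant toggles when calls nest.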


Alex




