qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[RFC PATCH v2 2/2] cputlb: implement load_helper_unaligned() for unalign


From: Philippe Mathieu-Daudé
Subject: [RFC PATCH v2 2/2] cputlb: implement load_helper_unaligned() for unaligned loads
Date: Wed, 9 Jun 2021 16:10:10 +0200

From: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>

[RFC because this is currently only lightly tested and there have been some
discussions about whether this should be handled elsewhere in the memory API]

If an unaligned load is required then the load is split into 2 separate accesses
and combined together within load_helper(). This does not work correctly with
MMIO accesses because the original access size is used for both individual
accesses causing the little and big endian combine to return the wrong result.

There is already a similar solution in place for store_helper() where an 
unaligned
access is handled by a separate store_helper_unaligned() function which instead
of using the original access size, uses a single-byte access size to shift and
combine the result correctly regardless of the orignal access size or endian.

Implement a similar load_helper_unaligned() function which uses the same 
approach
for unaligned loads to return the correct result according to the original test
case.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/360
Message-Id: <20210609093528.9616-1-mark.cave-ayland@ilande.co.uk>
[PMD: Extract load_helper_unaligned() in earlier patch]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
---
 accel/tcg/cputlb.c | 84 +++++++++++++++++++++++++++++++++++++---------
 1 file changed, 68 insertions(+), 16 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 2b5d569412c..f8a790d8b4a 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1856,27 +1856,79 @@ load_helper_unaligned(CPUArchState *env, target_ulong 
addr, TCGMemOpIdx oi,
                       uintptr_t retaddr, MemOp op, bool code_read,
                       FullLoadHelper *full_load)
 {
+    uintptr_t mmu_idx = get_mmuidx(oi);
     size_t size = memop_size(op);
-    target_ulong addr1, addr2;
-    uint64_t res;
-    uint64_t r1, r2;
-    unsigned shift;
+    uintptr_t index, index2;
+    CPUTLBEntry *entry, *entry2;
+    const size_t tlb_off = code_read ?
+        offsetof(CPUTLBEntry, addr_code) : offsetof(CPUTLBEntry, addr_read);
+    const MMUAccessType access_type =
+        code_read ? MMU_INST_FETCH : MMU_DATA_LOAD;
+    target_ulong page2, tlb_addr, tlb_addr2;
+    uint64_t val = 0;
+    size_t size2;
+    int i;
 
-    addr1 = addr & ~((target_ulong)size - 1);
-    addr2 = addr1 + size;
-    r1 = full_load(env, addr1, oi, retaddr);
-    r2 = full_load(env, addr2, oi, retaddr);
-    shift = (addr & (size - 1)) * 8;
+    /*
+     * Ensure the second page is in the TLB.  Note that the first page
+     * is already guaranteed to be filled, and that the second page
+     * cannot evict the first.
+     */
+    page2 = (addr + size) & TARGET_PAGE_MASK;
+    size2 = (addr + size) & ~TARGET_PAGE_MASK;
+    index2 = tlb_index(env, mmu_idx, page2);
+    entry2 = tlb_entry(env, mmu_idx, page2);
 
-    if (memop_big_endian(op)) {
-        /* Big-endian combine.  */
-        res = (r1 << shift) | (r2 >> ((size * 8) - shift));
-    } else {
-        /* Little-endian combine.  */
-        res = (r1 >> shift) | (r2 << ((size * 8) - shift));
+    tlb_addr2 = code_read ? entry2->addr_code : entry2->addr_read;
+    if (!tlb_hit_page(tlb_addr2, page2)) {
+        if (!victim_tlb_hit(env, mmu_idx, index2, tlb_off, page2)) {
+            tlb_fill(env_cpu(env), page2, size2, access_type,
+                     mmu_idx, retaddr);
+            index2 = tlb_index(env, mmu_idx, page2);
+            entry2 = tlb_entry(env, mmu_idx, page2);
+        }
+        tlb_addr2 = code_read ? entry2->addr_code : entry2->addr_read;
     }
 
-    return res & MAKE_64BIT_MASK(0, size * 8);
+    index = tlb_index(env, mmu_idx, addr);
+    entry = tlb_entry(env, mmu_idx, addr);
+    tlb_addr = code_read ? entry->addr_code : entry->addr_read;
+
+    /*
+     * Handle watchpoints
+     */
+    if (unlikely(tlb_addr & TLB_WATCHPOINT)) {
+        cpu_check_watchpoint(env_cpu(env), addr, size - size2,
+                             env_tlb(env)->d[mmu_idx].iotlb[index].attrs,
+                             BP_MEM_READ, retaddr);
+    }
+    if (unlikely(tlb_addr2 & TLB_WATCHPOINT)) {
+        cpu_check_watchpoint(env_cpu(env), page2, size2,
+                             env_tlb(env)->d[mmu_idx].iotlb[index2].attrs,
+                             BP_MEM_READ, retaddr);
+    }
+
+    /*
+     * XXX: not efficient, but simple.
+     * This loop must go in the forward direction to avoid issues
+     * with self-modifying code in Windows 64-bit.
+     */
+    oi = make_memop_idx(MO_UB, mmu_idx);
+    if (memop_big_endian(op)) {
+        for (i = 0; i < size; ++i) {
+            /* Big-endian load.  */
+            uint8_t val8 = helper_ret_ldub_mmu(env, addr + i, oi, retaddr);
+            val |= val8 << (((size - 1) * 8) - (i * 8));
+        }
+    } else {
+        for (i = 0; i < size; ++i) {
+            /* Little-endian load.  */
+            uint8_t val8 = helper_ret_ldub_mmu(env, addr + i, oi, retaddr);
+            val |= val8 << (i * 8);
+        }
+    }
+
+    return val & MAKE_64BIT_MASK(0, size * 8);
 }
 
 static inline uint64_t QEMU_ALWAYS_INLINE
-- 
2.31.1




reply via email to

[Prev in Thread] Current Thread [Next in Thread]