[PULL 31/35] util/bufferiszero: Remove useless prefetches
From: Richard Henderson
Subject: [PULL 31/35] util/bufferiszero: Remove useless prefetches
Date: Mon, 8 Apr 2024 07:49:25 -1000
From: Alexander Monakov <amonakov@ispras.ru>
Use of prefetching in bufferiszero.c is quite questionable:
- prefetches are issued just a few CPU cycles before the corresponding
line would be hit by demand loads;
- they are done for simple access patterns, i.e. where hardware
prefetchers can perform better;
- they compete for load ports in loops that should be limited by load
port throughput rather than ALU throughput.
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Mikhail Romanov <mmromanov@ispras.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240206204809.9859-5-amonakov@ispras.ru>
---
util/bufferiszero.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/util/bufferiszero.c b/util/bufferiszero.c
index 972f394cbd..00118d649e 100644
--- a/util/bufferiszero.c
+++ b/util/bufferiszero.c
@@ -50,7 +50,6 @@ static bool buffer_is_zero_integer(const void *buf, size_t len)
const uint64_t *e = (uint64_t *)(((uintptr_t)buf + len) & -8);
for (; p + 8 <= e; p += 8) {
- __builtin_prefetch(p + 8);
if (t) {
return false;
}
@@ -80,7 +79,6 @@ buffer_zero_sse2(const void *buf, size_t len)
/* Loop over 16-byte aligned blocks of 64. */
while (likely(p <= e)) {
- __builtin_prefetch(p);
t = _mm_cmpeq_epi8(t, zero);
if (unlikely(_mm_movemask_epi8(t) != 0xFFFF)) {
return false;
@@ -111,7 +109,6 @@ buffer_zero_avx2(const void *buf, size_t len)
/* Loop over 32-byte aligned blocks of 128. */
while (p <= e) {
- __builtin_prefetch(p);
if (unlikely(!_mm256_testz_si256(t, t))) {
return false;
}
--
2.34.1
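
For readers who want to see the shape of loop the commit message is reasoning about, here is a minimal sketch of a zero-check that walks the buffer sequentially and issues no explicit prefetch. It is not the QEMU implementation: the name `sketch_buffer_is_zero`, the 64-byte block size, and the `memcpy`-based loads are assumptions chosen to keep the example portable and self-contained. The point is only that every iteration reads the bytes immediately after the previous ones, which is the simple access pattern the commit message expects hardware prefetchers to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from util/bufferiszero.c.
 * Scans the buffer front to back with no __builtin_prefetch() call.
 */
static bool sketch_buffer_is_zero(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    size_t i = 0;

    assert(buf != NULL || len == 0);

    /* Main loop: eight 8-byte words per iteration, OR-ed together. */
    for (; i + 64 <= len; i += 64) {
        uint64_t w[8];
        uint64_t t;

        /* memcpy keeps the loads valid for any alignment of buf. */
        memcpy(w, p + i, sizeof(w));
        t = w[0] | w[1] | w[2] | w[3] | w[4] | w[5] | w[6] | w[7];
        if (t) {
            return false;
        }
    }

    /* Tail: fewer than 64 bytes remain, check them one at a time. */
    for (; i < len; i++) {
        if (p[i]) {
            return false;
        }
    }
    return true;
}
```

A caller would use it as `sketch_buffer_is_zero(page, page_size)`. Whether dropping the prefetch helps on a given machine is something to measure rather than assume; the sketch only shows the access pattern.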