From: Peter Maydell
Subject: [Qemu-commits] [qemu/qemu] 3397b5: tcg/optimize: Do not attempt to constant fold neg_vec
Date: Tue, 09 Apr 2024 04:54:03 -0700

  Branch: refs/heads/staging
  Home:   https://github.com/qemu/qemu
  Commit: 3397b5420072babbd739bf17bbd5c9d8ad46ccce
      
https://github.com/qemu/qemu/commit/3397b5420072babbd739bf17bbd5c9d8ad46ccce
  Author: Richard Henderson <richard.henderson@linaro.org>
  Date:   2024-04-04 (Thu, 04 Apr 2024)

  Changed paths:
    M tcg/optimize.c
    M tests/tcg/aarch64/Makefile.target
    A tests/tcg/aarch64/test-2150.c

  Log Message:
  -----------
  tcg/optimize: Do not attempt to constant fold neg_vec

Split out the tail of fold_neg to fold_neg_no_const so that we
can avoid attempting to constant fold vector negate.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2150
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: 60075a0dedac66052db0a05543646550359a5bf6
      
https://github.com/qemu/qemu/commit/60075a0dedac66052db0a05543646550359a5bf6
  Author: Richard Henderson <richard.henderson@linaro.org>
  Date:   2024-04-06 (Sat, 06 Apr 2024)

  Changed paths:
    M linux-user/syscall.c

  Log Message:
  -----------
  linux-user: Fix waitid return of siginfo_t and rusage

The copy back to siginfo_t should be conditional only on arg3,
not the specific values that might have been written.
The copy back to rusage was missing entirely.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2262
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: f4f8b8b8d3b31078aee48e1a5c710a025ee19627
      
https://github.com/qemu/qemu/commit/f4f8b8b8d3b31078aee48e1a5c710a025ee19627
  Author: Michael Tokarev <mjt@tls.msk.ru>
  Date:   2024-04-06 (Sat, 06 Apr 2024)

  Changed paths:
    M linux-user/syscall.c

  Log Message:
  -----------
  linux-user: do_setsockopt: fix SOL_ALG.ALG_SET_KEY

This setsockopt accepts a zero-length optlen (the current qemu
implementation does not allow this).  Also, there's no need to make a
copy of the key; it is enough to use lock_user() (which already accepts
zero length).

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2197
Fixes: f31dddd2fc "linux-user: Add support for setsockopt() option SOL_ALG"
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20240331100737.2724186-2-mjt@tls.msk.ru>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: 6ae34421da3df4e05d4ae915ace7c799be9d8b44
      
https://github.com/qemu/qemu/commit/6ae34421da3df4e05d4ae915ace7c799be9d8b44
  Author: Michael Tokarev <mjt@tls.msk.ru>
  Date:   2024-04-06 (Sat, 06 Apr 2024)

  Changed paths:
    M linux-user/syscall.c

  Log Message:
  -----------
  linux-user: do_setsockopt: make ip_mreq local to the place it is used and inline target_to_host_ip_mreq()

ip_mreq is declared at the beginning of do_setsockopt(), while
it is used in only one place.  Move its declaration to that very
place and replace pointer to alloca()-allocated memory with the
structure itself.

target_to_host_ip_mreq() is used only once, inline it.

This change also properly handles TARGET_EFAULT when the address
is wrong.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20240331100737.2724186-3-mjt@tls.msk.ru>
[rth: Fix braces, adjust optlen to match host structure size]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: 01c2896b4a085039b8cd34fef1cc55015f5a78b3
      
https://github.com/qemu/qemu/commit/01c2896b4a085039b8cd34fef1cc55015f5a78b3
  Author: Michael Tokarev <mjt@tls.msk.ru>
  Date:   2024-04-06 (Sat, 06 Apr 2024)

  Changed paths:
    M linux-user/syscall.c

  Log Message:
  -----------
  linux-user: do_setsockopt: make ip_mreq_source local to the place where it is used

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20240331100737.2724186-4-mjt@tls.msk.ru>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: 819358f6a5a0499ff77d2bbf7ab18b7708e8c0d5
      
https://github.com/qemu/qemu/commit/819358f6a5a0499ff77d2bbf7ab18b7708e8c0d5
  Author: Michael Tokarev <mjt@tls.msk.ru>
  Date:   2024-04-06 (Sat, 06 Apr 2024)

  Changed paths:
    M linux-user/syscall.c

  Log Message:
  -----------
  linux-user: do_setsockopt: eliminate goto in switch for SO_SNDTIMEO

There's identical code for SO_SNDTIMEO and SO_RCVTIMEO, currently
implemented using an ugly goto into another switch case.  Eliminate
that using arithmetic if, making code flow more natural.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20240331100737.2724186-5-mjt@tls.msk.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
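
A minimal sketch of the refactor described in the log message above,
reading "arithmetic if" as a conditional (?:) expression, which is an
assumption.  The function names and all option numbers below are
hypothetical placeholders, not QEMU's definitions.  The two cases share
one body and only the host option is selected per case:

```c
#include <assert.h>

/* Hypothetical option numbers, for illustration only. */
enum { TARGET_SO_RCVTIMEO = 1, TARGET_SO_SNDTIMEO = 2 };
enum { HOST_SO_RCVTIMEO = 20, HOST_SO_SNDTIMEO = 21 };

/* Return the host option for a target timeout option, or -1. */
static int pick_host_timeout_opt(int target_optname)
{
    switch (target_optname) {
    case TARGET_SO_RCVTIMEO:
    case TARGET_SO_SNDTIMEO:
        /* One shared body; only the host option number differs. */
        return target_optname == TARGET_SO_RCVTIMEO
               ? HOST_SO_RCVTIMEO : HOST_SO_SNDTIMEO;
    default:
        return -1;
    }
}

/* Usage check: both timeout options map through the shared case. */
static void check_timeout_mapping(void)
{
    assert(pick_host_timeout_opt(TARGET_SO_RCVTIMEO) == HOST_SO_RCVTIMEO);
    assert(pick_host_timeout_opt(TARGET_SO_SNDTIMEO) == HOST_SO_SNDTIMEO);
    assert(pick_host_timeout_opt(99) == -1);
}
```

Compared with a goto into a neighbouring case, the shared body keeps
control flow inside a single case label.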


  Commit: 5a5ce3f57f3b8d189e2d70f73981876a621b8426
      
https://github.com/qemu/qemu/commit/5a5ce3f57f3b8d189e2d70f73981876a621b8426
  Author: Michael Vogt <mvogt@redhat.com>
  Date:   2024-04-06 (Sat, 06 Apr 2024)

  Changed paths:
    M linux-user/ioctls.h
    M linux-user/syscall_defs.h
    M linux-user/syscall_types.h

  Log Message:
  -----------
  linux-user: Add FITRIM ioctl

Tiny patch to add the missing FITRIM ioctl.

Signed-off-by: Michael Vogt <mvogt@redhat.com>
Message-Id: <20240403092048.16023-2-michael.vogt@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: 8cacf1c5d591859df90948d692ca927361e63fcf
      
https://github.com/qemu/qemu/commit/8cacf1c5d591859df90948d692ca927361e63fcf
  Author: Nguyen Dinh Phi <phind.uet@gmail.com>
  Date:   2024-04-06 (Sat, 06 Apr 2024)

  Changed paths:
    M linux-user/main.c

  Log Message:
  -----------
  linux-user: replace calloc() with g_new0()

Use glib allocation as recommended by the coding convention

Signed-off-by: Nguyen Dinh Phi <phind.uet@gmail.com>
Message-Id: <20240317171747.1642207-1-phind.uet@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: dc5923e47a0a0e1799923f875820eaf330f13430
      
https://github.com/qemu/qemu/commit/dc5923e47a0a0e1799923f875820eaf330f13430
  Author: Richard Henderson <richard.henderson@linaro.org>
  Date:   2024-04-06 (Sat, 06 Apr 2024)

  Changed paths:
    M target/hppa/int_helper.c
    M target/hppa/sys_helper.c

  Log Message:
  -----------
  target/hppa: Fix IIAOQ, IIASQ for pa2.0

The contents of IIAOQ depend on PSW_W.
Follow the text in "Interruption Instruction Address Queues",
pages 2-13 through 2-15.

Tested-by: Sven Schnelle <svens@stackframe.org>
Tested-by: Helge Deller <deller@gmx.de>
Reported-by: Sven Schnelle <svens@stackframe.org>
Fixes: b10700d826c ("target/hppa: Update IIAOQ, IIASQ for pa2.0")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: 8f78668a3e237752f95f08037374e5eeba6d5274
      
https://github.com/qemu/qemu/commit/8f78668a3e237752f95f08037374e5eeba6d5274
  Author: Zack Buhman <zack@buhman.org>
  Date:   2024-04-06 (Sat, 06 Apr 2024)

  Changed paths:
    M target/sh4/translate.c

  Log Message:
  -----------
  target/sh4: mac.w: memory accesses are 16-bit words

Before this change, executing a code sequence such as:

           mova   tblm,r0
           mov    r0,r1
           mova   tbln,r0
           clrs
           clrmac
           mac.w  @r0+,@r1+
           mac.w  @r0+,@r1+

           .align 4
  tblm:    .word  0x1234
           .word  0x5678
  tbln:    .word  0x9abc
           .word  0xdef0

does not result in correct behavior:

Expected behavior:
  first macw : macl = 0x1234 * 0x9abc + 0x0
               mach = 0x0

  second macw: macl = 0x5678 * 0xdef0 + 0xb00a630
               mach = 0x0

Observed behavior (qemu-sh4eb, prior to this commit):

  first macw : macl = 0x5678 * 0xdef0 + 0x0
               mach = 0x0
               mach = 0x0

  second macw: (unaligned longword memory access, SIGBUS)

Various SH-4 ISA manuals also confirm that `mac.w` is a 16-bit word memory
access, not a 32-bit longword memory access.

Signed-off-by: Zack Buhman <zack@buhman.org>
Reviewed-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240402093756.27466-1-zack@buhman.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: 9a12317d153bd00b6beaab86a5e861a8ac1438f1
      
https://github.com/qemu/qemu/commit/9a12317d153bd00b6beaab86a5e861a8ac1438f1
  Author: Richard Henderson <richard.henderson@linaro.org>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M target/sh4/cpu.h

  Log Message:
  -----------
  target/sh4: Merge mach and macl into a union

Allow host access to the entire 64-bit accumulator.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
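
One way such a union could look is sketched below.  The type name
MacState is hypothetical, the field order is chosen from the host byte
order so that mach is always the high half, and the __BYTE_ORDER__
macros are GCC/Clang predefined macros.  This is an illustration under
those assumptions, not the actual cpu.h change:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the accumulator part of the CPU state. */
typedef struct {
    union {
        uint64_t mac;               /* whole 64-bit accumulator */
        struct {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
            uint32_t mach, macl;    /* high half first in memory */
#else
            uint32_t macl, mach;    /* low half first in memory */
#endif
        };
    };
} MacState;

/* Usage check: the halves and the 64-bit view alias the same storage. */
static void check_mac_union(void)
{
    MacState s;

    s.mac = 0x1122334455667788ULL;
    assert(s.mach == 0x11223344u);
    assert(s.macl == 0x55667788u);

    s.mach = 1;
    s.macl = 2;
    assert(s.mac == 0x0000000100000002ULL);
}
```

With this layout a helper can do one 64-bit read-modify-write instead
of combining and splitting two 32-bit fields.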


  Commit: 09e3bed95553ebdc4f131e154545b9c811bd00f9
      
https://github.com/qemu/qemu/commit/09e3bed95553ebdc4f131e154545b9c811bd00f9
  Author: Zack Buhman <zack@buhman.org>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M target/sh4/helper.h
    M target/sh4/op_helper.c
    M tests/tcg/sh4/Makefile.target
    A tests/tcg/sh4/test-macl.c

  Log Message:
  -----------
  target/sh4: Fix mac.l with saturation enabled

The saturation arithmetic logic in helper_macl is not correct.
I tested and verified this behavior on a SH7091.

Signed-off-by: Zack Buhman <zack@buhman.org>
Message-Id: <20240404162641.27528-2-zack@buhman.org>
[rth: Reformat helper_macl, add a test case.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
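
A rough sketch of saturating multiply-accumulate for mac.l follows.  It
assumes the signed 48-bit saturation range that the SH-4 documentation
gives for mac.l when the S bit is set.  The function name and the
boolean flag are hypothetical, and __builtin_add_overflow is a
GCC/Clang builtin.  This is not the helper_macl code from the commit:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Add the signed product a*b to a 64-bit accumulator. */
static int64_t macl_accumulate(int64_t mac, int32_t a, int32_t b,
                               bool saturate)
{
    const int64_t min48 = -((int64_t)1 << 47);
    const int64_t max48 = ((int64_t)1 << 47) - 1;
    int64_t mul = (int64_t)a * b;   /* always fits in 64 bits */
    int64_t res;

    if (!saturate) {
        /* Wrap modulo 2^64; unsigned add avoids signed-overflow UB. */
        return (int64_t)((uint64_t)mac + (uint64_t)mul);
    }
    if (__builtin_add_overflow(mac, mul, &res)) {
        /* Overflow needs same-sign operands; clamp toward that sign. */
        return mac < 0 ? min48 : max48;
    }
    if (res < min48) {
        return min48;
    }
    if (res > max48) {
        return max48;
    }
    return res;
}

/* Usage check for in-range and clamped results. */
static void check_macl_accumulate(void)
{
    const int64_t min48 = -((int64_t)1 << 47);
    const int64_t max48 = ((int64_t)1 << 47) - 1;

    assert(macl_accumulate(5, 2, 3, false) == 11);
    assert(macl_accumulate(10, -3, 4, true) == -2);
    assert(macl_accumulate(max48, 1, 1, true) == max48);
    assert(macl_accumulate(min48, -1, 1, true) == min48);
    assert(macl_accumulate(0, INT32_MIN, INT32_MIN, true) == max48);
}
```

The 64-bit overflow check comes first because a wrapped sum could
otherwise land inside the 48-bit range and escape the clamp.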


  Commit: 2f1f49cd8efb824497ddafa69cce32695b0c5864
      
https://github.com/qemu/qemu/commit/2f1f49cd8efb824497ddafa69cce32695b0c5864
  Author: Zack Buhman <zack@buhman.org>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M target/sh4/helper.h
    M target/sh4/op_helper.c
    M tests/tcg/sh4/Makefile.target
    A tests/tcg/sh4/test-macw.c

  Log Message:
  -----------
  target/sh4: Fix mac.w with saturation enabled

The saturation arithmetic logic in helper_macw is not correct.
I tested and verified this behavior on a SH7091.

Reviewed-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Zack Buhman <zack@buhman.org>
Message-Id: <20240405233802.29128-3-zack@buhman.org>
[rth: Reformat helper_macw, add a test case.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>


  Commit: 2291f448b5f1fd981eaef060493287d11844e11a
      
https://github.com/qemu/qemu/commit/2291f448b5f1fd981eaef060493287d11844e11a
  Author: Zack Buhman <zack@buhman.org>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M target/sh4/translate.c

  Log Message:
  -----------
  target/sh4: add missing CHECK_NOT_DELAY_SLOT

CHECK_NOT_DELAY_SLOT is correctly applied to the branch-related
instructions, but not to the PC-relative mov* instructions.

I verified the existence of an illegal slot exception on a SH7091 when
any of these instructions are attempted inside a delay slot.

This also matches the behavior described in the SH-4 ISA manual.

Signed-off-by: Zack Buhman <zack@buhman.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240407150705.5965-1-zack@buhman.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Yoshinori Sato <ysato@users.sourceforge.jp>


  Commit: 9730dd94167e716cd7fcc533e247a7c8309951eb
      
https://github.com/qemu/qemu/commit/9730dd94167e716cd7fcc533e247a7c8309951eb
  Author: Keith Packard <keithp@keithp.com>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M target/m68k/cpu.c
    M target/m68k/cpu.h
    M target/m68k/fpu_helper.c
    M target/m68k/helper.c
    M target/m68k/helper.h
    M target/m68k/translate.c

  Log Message:
  -----------
  target/m68k: Map FPU exceptions to FPSR register

Add helpers for reading/writing the 68881 FPSR register so that
changes in floating point exception state can be seen by the
application.

Call these helpers in pre_load/post_load hooks to synchronize
exception state.

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230803035231.429697-1-keithp@keithp.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: 3eac48a7e1a94c85561ffc5baea01e40638f4c63
      
https://github.com/qemu/qemu/commit/3eac48a7e1a94c85561ffc5baea01e40638f4c63
  Author: Keith Packard <keithp@keithp.com>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M target/m68k/m68k-semi.c

  Log Message:
  -----------
  target/m68k: Pass semihosting arg to exit

Instead of using d0 (the semihost function number), use d1 (the
provided exit status).

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230802161914.395443-2-keithp@keithp.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: 04a23882367cf72ee2189eb47f42a6803ba5cbe5
      
https://github.com/qemu/qemu/commit/04a23882367cf72ee2189eb47f42a6803ba5cbe5
  Author: Richard Henderson <richard.henderson@linaro.org>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M target/m68k/cpu.h
    M target/m68k/op_helper.c
    M target/m68k/translate.c

  Log Message:
  -----------
  target/m68k: Perform the semihosting test during translate

Replace EXCP_HALT_INSN by EXCP_SEMIHOSTING.  Perform the pre-
and post-insn tests during translate, leaving only the actual
semihosting operation for the exception.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: 92b16b4d406a4055c6ea69f4a76de4715eb2583d
      
https://github.com/qemu/qemu/commit/92b16b4d406a4055c6ea69f4a76de4715eb2583d
  Author: Keith Packard <keithp@keithp.com>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M target/m68k/translate.c

  Log Message:
  -----------
  target/m68k: Support semihosting on non-ColdFire targets

According to the m68k semihosting spec:

"The instruction used to trigger a semihosting request depends on the
 m68k processor variant.  On ColdFire, "halt" is used; on other processors
 (which don't implement "halt"), "bkpt #0" may be used."

Add support for non-ColdFire processors by matching BKPT #0 instructions.

Signed-off-by: Keith Packard <keithp@keithp.com>
[rth: Use semihosting_test()]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: d76c86bf56de2365340d79bd4f3d3eb1023ae403
      
https://github.com/qemu/qemu/commit/d76c86bf56de2365340d79bd4f3d3eb1023ae403
  Author: Richard Henderson <richard.henderson@linaro.org>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M include/tcg/tcg.h
    M tcg/tcg.c

  Log Message:
  -----------
  tcg: Add TCGContext.emit_before_op

Allow operations to be emitted via normal expanders
into the middle of the opcode stream.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: f3572df79d13e1dd6e6c0ea4b1eaca1a4828c3fa
      
https://github.com/qemu/qemu/commit/f3572df79d13e1dd6e6c0ea4b1eaca1a4828c3fa
  Author: Richard Henderson <richard.henderson@linaro.org>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M accel/tcg/translator.c
    M include/exec/translator.h

  Log Message:
  -----------
  accel/tcg: Add insn_start to DisasContextBase

This is currently target-specific for many; begin making it
target independent.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: 14318e1fff644e7617e5abfc3fb7d7acda423aeb
      
https://github.com/qemu/qemu/commit/14318e1fff644e7617e5abfc3fb7d7acda423aeb
  Author: Richard Henderson <richard.henderson@linaro.org>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M target/arm/tcg/translate-a64.c
    M target/arm/tcg/translate.c
    M target/arm/tcg/translate.h

  Log Message:
  -----------
  target/arm: Use insn_start from DisasContextBase

To keep the multiple update check, replace insn_start
with insn_start_updated.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: 3780912a020ca9a1acfeea4742633742a3c6c87e
      
https://github.com/qemu/qemu/commit/3780912a020ca9a1acfeea4742633742a3c6c87e
  Author: Richard Henderson <richard.henderson@linaro.org>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M target/hppa/translate.c

  Log Message:
  -----------
  target/hppa: Use insn_start from DisasContextBase

To keep the multiple update check, replace insn_start
with insn_start_updated.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: fb08b0fd5807fdb38a32630a9854b0b8ce15435a
      
https://github.com/qemu/qemu/commit/fb08b0fd5807fdb38a32630a9854b0b8ce15435a
  Author: Richard Henderson <richard.henderson@linaro.org>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M target/i386/tcg/translate.c

  Log Message:
  -----------
  target/i386: Preserve DisasContextBase.insn_start across rewind

When aborting translation of the current insn, restore the
previous value of insn_start.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: b4047e6131ca2521d0262907684b3afac47ef1e8
      
https://github.com/qemu/qemu/commit/b4047e6131ca2521d0262907684b3afac47ef1e8
  Author: Richard Henderson <richard.henderson@linaro.org>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M target/microblaze/translate.c

  Log Message:
  -----------
  target/microblaze: Use insn_start from DisasContextBase

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: 50be57cdf7e858fbbd5984298d54f820074e9c61
      
https://github.com/qemu/qemu/commit/50be57cdf7e858fbbd5984298d54f820074e9c61
  Author: Richard Henderson <richard.henderson@linaro.org>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M target/riscv/translate.c

  Log Message:
  -----------
  target/riscv: Use insn_start from DisasContextBase

To keep the multiple update check, replace insn_start
with insn_start_updated.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: 8f2b86000cecabad9aa40c32fc96424458cdfa72
      
https://github.com/qemu/qemu/commit/8f2b86000cecabad9aa40c32fc96424458cdfa72
  Author: Richard Henderson <richard.henderson@linaro.org>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M target/s390x/tcg/translate.c

  Log Message:
  -----------
  target/s390x: Use insn_start from DisasContextBase

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: 49077924f038cf15cb1ce0927e3d937fc0fdb1c8
      
https://github.com/qemu/qemu/commit/49077924f038cf15cb1ce0927e3d937fc0fdb1c8
  Author: Richard Henderson <richard.henderson@linaro.org>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M accel/tcg/translator.c
    M include/exec/translator.h

  Log Message:
  -----------
  accel/tcg: Improve can_do_io management

We already attempted to set and clear can_do_io before the first
and last insns, but only used the initial value of max_insns and
the call to translator_io_start to find those insns.

Now that we track insn_start in DisasContextBase, and now that
we have emit_before_op, we can wait until we have finished
translation to identify the true first and last insns and emit
the sets of can_do_io at that time.

This fixes the case of a translation block which crossed a page
boundary, and for which the second page turned out to be mmio.
In this case we truncate the block, and the previous logic for
can_do_io could leave a block with a single insn with can_do_io
set to false, which would fail an assertion in cpu_io_recompile.

Reported-by: Jørgen Hansen <Jorgen.Hansen@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Jørgen Hansen <Jorgen.Hansen@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: 9e6f81d82cefac8706dfd65780625480ae41330b
      
https://github.com/qemu/qemu/commit/9e6f81d82cefac8706dfd65780625480ae41330b
  Author: Alexander Monakov <amonakov@ispras.ru>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M util/bufferiszero.c

  Log Message:
  -----------
  util/bufferiszero: Remove SSE4.1 variant

The SSE4.1 variant is virtually identical to the SSE2 variant, except
for using 'PTEST+JNZ' in place of 'PCMPEQB+PMOVMSKB+CMP+JNE' for testing
if an SSE register is all zeroes. The PTEST instruction decodes to two
uops, so it can be handled only by the complex decoder, and since
CMP+JNE are macro-fused, both sequences decode to three uops. The uops
comprising the PTEST instruction dispatch to p0 and p5 on Intel CPUs, so
PCMPEQB+PMOVMSKB is comparatively more flexible from a dispatch
standpoint.

Hence, the use of PTEST brings no benefit from a throughput standpoint.
Its latency is not important, since it feeds only a conditional jump,
which terminates the dependency chain.

I never observed PTEST variants to be faster on real hardware.

Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Mikhail Romanov <mmromanov@ispras.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240206204809.9859-2-amonakov@ispras.ru>


  Commit: 8c3b8444cce1623b9db118c398491eb0d002db5b
      
https://github.com/qemu/qemu/commit/8c3b8444cce1623b9db118c398491eb0d002db5b
  Author: Alexander Monakov <amonakov@ispras.ru>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M util/bufferiszero.c

  Log Message:
  -----------
  util/bufferiszero: Remove AVX512 variant

Thanks to early checks in the inline buffer_is_zero wrapper, the SIMD
routines are invoked much more rarely in normal use when most buffers
are non-zero. This makes use of AVX512 unprofitable, as it incurs extra
frequency and voltage transition periods during which the CPU operates
at reduced performance, as described in
https://travisdowns.github.io/blog/2020/01/17/avxfreq1.html

Signed-off-by: Mikhail Romanov <mmromanov@ispras.ru>
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240206204809.9859-4-amonakov@ispras.ru>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: 782ff6a3200c59dd284ca708f08cbe9d70825b86
      
https://github.com/qemu/qemu/commit/782ff6a3200c59dd284ca708f08cbe9d70825b86
  Author: Alexander Monakov <amonakov@ispras.ru>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M include/qemu/cutils.h
    M util/bufferiszero.c

  Log Message:
  -----------
  util/bufferiszero: Reorganize for early test for acceleration

Test for length >= 256 inline, where it is often a constant.
Before calling into the accelerated routine, sample three bytes
from the buffer, which handles most non-zero buffers.

Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Mikhail Romanov <mmromanov@ispras.ru>
Message-Id: <20240206204809.9859-3-amonakov@ispras.ru>
[rth: Use __builtin_constant_p; move the indirect call out of line.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
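
The sampling step described above can be sketched as follows.  Which
three bytes are sampled is an assumption here (first, middle, last),
both function names are hypothetical, and the plain byte loop merely
stands in for an accelerated routine.  The length >= 256 dispatch is
left out to keep the sketch short:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical out-of-line full scan, standing in for a SIMD routine. */
static bool full_scan_is_zero(const unsigned char *p, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (p[i]) {
            return false;
        }
    }
    return true;
}

/* Cheap inline pre-check, then the full scan only if still undecided. */
static inline bool sketch_buffer_is_zero(const void *buf, size_t len)
{
    const unsigned char *p = buf;

    assert(buf != NULL || len == 0);
    if (len == 0) {
        return true;
    }
    /* Three samples reject most non-zero buffers without a call. */
    if (p[0] | p[len / 2] | p[len - 1]) {
        return false;
    }
    return full_scan_is_zero(p, len);
}
```

A buffer whose only non-zero byte is not one of the samples still
reaches the full scan, so the pre-check affects cost, not the result.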


  Commit: 82b35c712d1a5e28cd3f3b498ca24fc56d489e7d
      
https://github.com/qemu/qemu/commit/82b35c712d1a5e28cd3f3b498ca24fc56d489e7d
  Author: Alexander Monakov <amonakov@ispras.ru>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M util/bufferiszero.c

  Log Message:
  -----------
  util/bufferiszero: Remove useless prefetches

Use of prefetching in bufferiszero.c is quite questionable:

- prefetches are issued just a few CPU cycles before the corresponding
  line would be hit by demand loads;

- they are done for simple access patterns, i.e. where hardware
  prefetchers can perform better;

- they compete for load ports in loops that should be limited by load
  port throughput rather than ALU throughput.

Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Mikhail Romanov <mmromanov@ispras.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240206204809.9859-5-amonakov@ispras.ru>


  Commit: 2463e5b6df0d0e29fc3b346682cf883c53b16bca
      
https://github.com/qemu/qemu/commit/2463e5b6df0d0e29fc3b346682cf883c53b16bca
  Author: Alexander Monakov <amonakov@ispras.ru>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M util/bufferiszero.c

  Log Message:
  -----------
  util/bufferiszero: Optimize SSE2 and AVX2 variants

Increase unroll factor in SIMD loops from 4x to 8x in order to move
their bottlenecks from ALU port contention to load issue rate (two loads
per cycle on popular x86 implementations).

Avoid using out-of-bounds pointers in loop boundary conditions.

Follow SSE2 implementation strategy in the AVX2 variant. Avoid use of
PTEST, which is not profitable there (like in the removed SSE4 variant).

Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Mikhail Romanov <mmromanov@ispras.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240206204809.9859-6-amonakov@ispras.ru>


  Commit: 510f19ab99feacb49ad4b6da83acf302db43a268
      
https://github.com/qemu/qemu/commit/510f19ab99feacb49ad4b6da83acf302db43a268
  Author: Richard Henderson <richard.henderson@linaro.org>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M util/bufferiszero.c

  Log Message:
  -----------
  util/bufferiszero: Improve scalar variant

Split less-than and greater-than 256 cases.
Use unaligned accesses for head and tail.
Avoid using out-of-bounds pointers in loop boundary conditions.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
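
As an illustration of the last two points, the sketch below checks a
buffer of at least 8 bytes with unaligned 8-byte loads for the head and
tail, and walks the interior by offset so that no pointer past the end
of the buffer is ever formed.  The function names are hypothetical and
every load goes through memcpy.  This is not the code from the commit:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Load 8 bytes from any address; memcpy makes this alignment-safe. */
static uint64_t load64(const unsigned char *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Return true if all len bytes are zero.  Requires len >= 8. */
static bool words_are_zero(const unsigned char *buf, size_t len)
{
    uint64_t acc;

    assert(len >= 8);
    /* Head and tail loads may overlap each other or the interior. */
    acc = load64(buf) | load64(buf + len - 8);

    /* Interior: each load ends before len, the tail covers the rest. */
    for (size_t i = 8; i + 8 < len; i += 8) {
        acc |= load64(buf + i);
    }
    return acc == 0;
}
```

Because the loop condition compares offsets against len, the bound is
checked without computing an address beyond the buffer.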


  Commit: 1500374e532e292ae13b5d6591fe5f7234d4c3ba
      
https://github.com/qemu/qemu/commit/1500374e532e292ae13b5d6591fe5f7234d4c3ba
  Author: Richard Henderson <richard.henderson@linaro.org>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M util/bufferiszero.c

  Log Message:
  -----------
  util/bufferiszero: Introduce biz_accel_fn typedef

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: 50dbeda88ab71f9d426b7f4b126c79c44860e475
      
https://github.com/qemu/qemu/commit/50dbeda88ab71f9d426b7f4b126c79c44860e475
  Author: Richard Henderson <richard.henderson@linaro.org>
  Date:   2024-04-08 (Mon, 08 Apr 2024)

  Changed paths:
    M util/bufferiszero.c

  Log Message:
  -----------
  util/bufferiszero: Simplify test_buffer_is_zero_next_accel

Because the three alternatives are monotonic, we don't
need to keep a couple of bitmasks, just identify the
strongest alternative at startup.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


  Commit: e8cb49c07f57a8b9011a94359263103d9ad37029
      
https://github.com/qemu/qemu/commit/e8cb49c07f57a8b9011a94359263103d9ad37029
  Author: Peter Maydell <peter.maydell@linaro.org>
  Date:   2024-04-09 (Tue, 09 Apr 2024)

  Changed paths:
    M accel/tcg/translator.c
    M include/exec/translator.h
    M include/qemu/cutils.h
    M include/tcg/tcg.h
    M linux-user/ioctls.h
    M linux-user/main.c
    M linux-user/syscall.c
    M linux-user/syscall_defs.h
    M linux-user/syscall_types.h
    M target/arm/tcg/translate-a64.c
    M target/arm/tcg/translate.c
    M target/arm/tcg/translate.h
    M target/hppa/int_helper.c
    M target/hppa/sys_helper.c
    M target/hppa/translate.c
    M target/i386/tcg/translate.c
    M target/m68k/cpu.c
    M target/m68k/cpu.h
    M target/m68k/fpu_helper.c
    M target/m68k/helper.c
    M target/m68k/helper.h
    M target/m68k/m68k-semi.c
    M target/m68k/op_helper.c
    M target/m68k/translate.c
    M target/microblaze/translate.c
    M target/riscv/translate.c
    M target/s390x/tcg/translate.c
    M target/sh4/cpu.h
    M target/sh4/helper.h
    M target/sh4/op_helper.c
    M target/sh4/translate.c
    M tcg/optimize.c
    M tcg/tcg.c
    M tests/tcg/aarch64/Makefile.target
    A tests/tcg/aarch64/test-2150.c
    M tests/tcg/sh4/Makefile.target
    A tests/tcg/sh4/test-macl.c
    A tests/tcg/sh4/test-macw.c
    M util/bufferiszero.c

  Log Message:
  -----------
  Merge tag 'pull-misc-20240408' of https://gitlab.com/rth7680/qemu into staging

util/bufferiszero: Optimizations and cleanups, esp code removal
target/m68k: Semihosting for non-coldfire cpus
target/m68k: Fix fp accrued exception reporting
target/hppa: Fix IIAOQ, IIASQ for pa2.0
target/sh4: Fixes to mac.l and mac.w saturation
target/sh4: Fixes to illegal delay slot reporting
linux-user: Cleanups for do_setsockopt
linux-user: Add FITRIM ioctl
linux-user: Fix waitid return of siginfo_t and rusage
tcg/optimize: Do not attempt to constant fold neg_vec
accel/tcg: Improve can_do_io management, mmio bug fix

# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmYULZgdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+gIAgAqhtlPsZbnoXwqCNs
# fafa/lTvpZYHl2kMJVRYtMmU661HD2HGARe3XCc7/5ZldvEXeQKPde9VrhmasIe8
# EChS3xh1U3J2zEUbnHHgnC9DDVE7uvG0lTazXQRe6WESTaRuBz5d6a0GZtSkH6F4
# AFp0lyKAxX4cn07aBFg0MJ/oPe21Ay9tTQv+5Ox2JOSvaK+FbW7hXisyReF5MVwq
# WPQSELCSppdowxAcsHCD5Q8t/nwGfBbHKOjxLJCgf9xX1+9Wv3Ab8kMjRqaJWdXu
# CvvJ/DigLZiHWbU2TxR2dPOIFSNZOIJY/BJ92aCHw7q9a4Ii2tkTPoJoM/qqlBq8
# 0oLPXg==
# =00NK
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 08 Apr 2024 18:47:04 BST
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* tag 'pull-misc-20240408' of https://gitlab.com/rth7680/qemu: (35 commits)
  util/bufferiszero: Simplify test_buffer_is_zero_next_accel
  util/bufferiszero: Introduce biz_accel_fn typedef
  util/bufferiszero: Improve scalar variant
  util/bufferiszero: Optimize SSE2 and AVX2 variants
  util/bufferiszero: Remove useless prefetches
  util/bufferiszero: Reorganize for early test for acceleration
  util/bufferiszero: Remove AVX512 variant
  util/bufferiszero: Remove SSE4.1 variant
  accel/tcg: Improve can_do_io management
  target/s390x: Use insn_start from DisasContextBase
  target/riscv: Use insn_start from DisasContextBase
  target/microblaze: Use insn_start from DisasContextBase
  target/i386: Preserve DisasContextBase.insn_start across rewind
  target/hppa: Use insn_start from DisasContextBase
  target/arm: Use insn_start from DisasContextBase
  accel/tcg: Add insn_start to DisasContextBase
  tcg: Add TCGContext.emit_before_op
  target/m68k: Support semihosting on non-ColdFire targets
  target/m68k: Perform the semihosting test during translate
  target/m68k: Pass semihosting arg to exit
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>


Compare: https://github.com/qemu/qemu/compare/bc0cd4ae881d...e8cb49c07f57

