Re: [Qemu-devel] [PATCH 0/7] tcg: conditional set and move opcodes


From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH 0/7] tcg: conditional set and move opcodes
Date: Fri, 18 Dec 2009 08:05:57 -0800
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.4pre) Gecko/20090922 Fedora/3.0-3.9.b4.fc12 Thunderbird/3.0b4

On 12/18/2009 07:40 AM, malc wrote:
After fixing a bug (crop was done after reading the cr) I ran some
openssl speed benchmarks and, at least here on an MPC7447A, got a
speed degradation, tiny but consistent.

Well, you could try rendering the setcond with branches instead of logical operations. You'll still gain the benefit of not ending the TCG basic block, and of not forcing the globals to be stored back to their slots, etc.
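To make the two lowering strategies concrete, here is a rough sketch in plain C rather than in any backend's real code generator. The function names are made up for illustration, and it only models the "ne" condition on 32-bit values. Both variants compute the same 0/1 result; they differ only in whether the host code uses control flow or pure arithmetic.

```c
#include <stdint.h>

/* Hypothetical model of a branch-based lowering of setcond ... ne:
 * compare, then select 0 or 1 through control flow. */
uint32_t setcond_ne_branch(uint32_t a, uint32_t b)
{
    uint32_t dest;

    if (a != b) {
        goto lab_true;
    }
    dest = 0;
    goto lab_over;
lab_true:
    dest = 1;
lab_over:
    return dest;
}

/* Hypothetical model of a branch-free lowering using only logical
 * operations: for x = a ^ b, the top bit of (x | -x) is set exactly
 * when x is non-zero. */
uint32_t setcond_ne_logical(uint32_t a, uint32_t b)
{
    uint32_t x = a ^ b;

    return (x | (0u - x)) >> 31;
}
```

Which of the two is faster depends on the host; the point is only that either form can sit inside a single TCG basic block.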

IN:
0x40082295:  movzbl (%eax),%eax
0x40082298:  cmp    $0x3d,%al
0x4008229a:  setne  %dl
0x4008229d:  test   %al,%al
0x4008229f:  je     0x400822d2

OP after liveness analysis:
  mov_i32 tmp2,eax
  qemu_ld8u tmp0,tmp2,$0xffffffff
  mov_i32 eax,tmp0
  movi_i32 tmp1,$0x3d
  mov_i32 tmp0,eax
  nopn $0x2,$0x2
  sub_i32 cc_dst,tmp0,tmp1
  movi_i32 tmp13,$0xff
  and_i32 tmp4,cc_dst,tmp13
  movi_i32 tmp13,$0x0
  setcond_i32 tmp0,tmp4,tmp13,ne
  movi_i32 tmp14,$0xff
  and_i32 tmp13,tmp0,tmp14

....

OUT: [size=204]
0x601051b0:  lwz     r14,0(r27)
0x601051b4:  lbzx    r14,0,r14
0x601051b8:  mr      r15,r14
0x601051bc:  addi    r15,r15,-61
0x601051c0:  andi.   r15,r15,255
0x601051c4:  cmpwi   cr6,r15,0
0x601051c8:  crnot   4*cr7+eq,4*cr6+eq
0x601051cc:  mfcr    r0
0x601051d0:  rlwinm  r15,r0,31,31,31
0x601051d4:  andi.   r15,r15,255

...

So the fact that setcond produces 0/1 was never communicated to
TCG, not that I would claim that it's possible at all...

It isn't.
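To see why the trailing andi. in the dump is redundant, here is a small C model of the mfcr/rlwinm pair. It is a sketch only: the helper names are invented, and the CR layout assumed is the usual one where cr7's eq bit is bit 30 in IBM numbering (value 0x2). The rlwinm with SH=31, MB=31, ME=31 rotates that bit into the least significant position and masks everything else away, so the result can only be 0 or 1.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n in 0..31). */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> ((32 - n) & 31));
}

/* Build the rlwinm mask for bits mb..me, IBM numbering (bit 0 = MSB). */
static uint32_t ppc_mask(unsigned mb, unsigned me)
{
    uint32_t hi = 0xffffffffu >> mb;        /* bits mb..31 */
    uint32_t lo = 0xffffffffu << (31 - me); /* bits 0..me  */

    return (mb <= me) ? (hi & lo) : (hi | lo);
}

/* Illustrative model of: rlwinm rA,rS,sh,mb,me */
uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
    return rotl32(rs, sh) & ppc_mask(mb, me);
}

/* Illustrative model of: mfcr r0 ; rlwinm r15,r0,31,31,31
 * The argument stands in for the whole condition register. */
uint32_t extract_cr7_eq(uint32_t cr)
{
    return rlwinm(cr, 31, 31, 31);
}
```

Since extract_cr7_eq() never returns anything but 0 or 1, masking its result with 255 cannot change it. The host code generator simply has no way to know that.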

And anyway, if you look at the opcodes generated without the setcond patch, you'll see that "and 255" in there as well. Some more surgery on the i386 translator could probably get rid of it. All I replaced were sequences of

  brcond c1,c2,$lab_true
  movi dest,0
  br $lab_over
  set_label $lab_true
  movi dest,1
  set_label $lab_over



r~



