qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 2/9] tcg/optimizer: check types in copy propagat


From: Aurelien Jarno
Subject: Re: [Qemu-devel] [PATCH 2/9] tcg/optimizer: check types in copy propagation
Date: Thu, 20 Sep 2012 07:54:24 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Wed, Sep 19, 2012 at 02:33:46PM -0700, Richard Henderson wrote:
> On 09/19/2012 01:00 PM, Aurelien Jarno wrote:
> > The copy propagation doesn't check the types of the temps during copy
> > propagation. However TCG is using the mov_i32 for the i64 to i32
> > conversion and thus the two are not equivalent.
> > 
> > With this patch tcg_opt_gen_mov() doesn't consider two temps with
> > different types as copies anymore.
> > 
> > So far it seems the optimization was not aggressive enough to trigger
> > this bug, but it will be triggered later in this series once the copy
> > propagation is improved.
> 
> Exactly where/how does this manifest as problematic?

The problem arise when a 64-bit value is moved to a 32-bit value, and
later this 64-bit value is reused. With the copy propagation if you
consider them as identical, the 32-bit value might be used instead of
the 64-bit value. This happens for example for the umull instruction on
arm:

| OP:
|  ---- 0xf67e5ea0
|  mov_i32 tmp5,r3
|  mov_i32 tmp6,r8
|  ext32u_i64 tmp7,tmp5
|  ext32u_i64 tmp8,tmp6
|  mul_i64 tmp7,tmp7,tmp8

tmp7 is a 64-bit value

|  mov_i32 tmp6,tmp7

Now transfered to a 32-bit tmp

|  mov_i32 r1,tmp6

and a 32-bit register.

|  movi_i64 tmp8,$0x20
|  shr_i64 tmp7,tmp7,tmp8
|  mov_i32 tmp6,tmp7
|  mov_i32 r3,tmp6
|  goto_tb $0x1
|  movi_i32 pc,$0xf67e5ea4
|  exit_tb $0x7f16948b0299
| 
| OP after optimization and liveness analysis:
|  ---- 0xf67e5ea0
|  nopn $0x2,$0x2
|  nopn $0x2,$0x2
|  ext32u_i64 tmp7,r3
|  ext32u_i64 tmp8,r8
|  mul_i64 tmp7,tmp7,tmp8
|  mov_i32 tmp6,tmp7
|  mov_i32 r1,tmp6
|  movi_i64 tmp8,$0x20
|  shr_i64 tmp7,r1,tmp8

Here, tmp7 is replaced by r1. However r1 only contains the 32-bit low
part of tmp7, thus returning 0.

|  mov_i32 tmp6,tmp7
|  mov_i32 r3,tmp6
|  goto_tb $0x1
|  movi_i32 pc,$0xf67e5ea4
|  exit_tb $0x7f16948b0299
|  end


> We do this mov_i32 trick from i64->i32 when we're truncating.
> Given that we're not (yet) targeting mips64 and having to
> maintain 32-bit sign-extended quantities, I can't see how
> that would matter.

It does matter on architectures using different instructions to work on
the 32 part of a register the registers. Actually in the case a above
with a register shift right, shr_i32 and shr_i64 are always implemented
differently to not shift bits from the 32bit high part to the low part.

> We do the i32->i64 trick immediately before a proper extension.
> 
> In either case I can't see how plain copy propagation should
> matter until some other operation is involved.  So, do we have
> some other data propagation bug that is being masked here?

I think it's a bug of the copy propagation considering these registers
are equivalent. If on x86-64 you replace all accesses to rax by eax,
your code will be smaller, but I am not sure it is going to work 
correctly.

The only latent bug I can see there, is having mov_i32 both being used
to move data between 32-bit temps and between a 64-bit temp and a 32-bit
temp.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
address@hidden                 http://www.aurel32.net



reply via email to

[Prev in Thread] Current Thread [Next in Thread]