Re: [avr-gcc-list] GCC-AVR Register optimisations


From: andrewhutchinson
Subject: Re: [avr-gcc-list] GCC-AVR Register optimisations
Date: Thu, 10 Jan 2008 13:59:44 -0500

OK, I checked the instruction patterns in GCC's avr.md, and the alternatives 
that use the ADIW registers are marked "!" for 16-bit ADD, 32-bit ADD and 
16-bit TEST.

This means that reload will not use them, and elsewhere they will only be a 
second or third choice, which seems to match your observations!

It will also push allocation away from R24-R30, which might explain why R14 
was getting used.
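
To make the effect of "!" concrete, here is a toy model in C of the rule as the 
GCC manual describes it: a severely disparaged alternative can still be used 
when the operands already fit, but is passed over once a reload is needed. This 
is only a sketch and not GCC's reload code; the struct, its fields and 
choose_alternative() are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only, not GCC's reload implementation. */
struct alternative {
    const char *name;        /* hypothetical label, e.g. "adiw" */
    int fits_as_is;          /* operands already satisfy the constraints */
    int severely_disparaged; /* alternative is marked "!" */
};

/* Returns the index of the chosen alternative, or -1 if none is usable. */
int choose_alternative(const struct alternative *alts, size_t n)
{
    size_t i;

    /* An alternative that fits without reloading may be used, even with "!". */
    for (i = 0; i < n; i++)
        if (alts[i].fits_as_is)
            return (int)i;

    /* A reload is needed: alternatives marked "!" are skipped. */
    for (i = 0; i < n; i++)
        if (!alts[i].severely_disparaged)
            return (int)i;

    return -1;
}

/* Small self-check showing both cases. */
void check_choose_alternative(void)
{
    struct alternative alts[] = {
        { "adiw",      0, 1 },
        { "subi/sbci", 0, 0 },
    };

    assert(choose_alternative(alts, 2) == 1); /* reload needed: "!" skipped */
    alts[0].fits_as_is = 1;
    assert(choose_alternative(alts, 2) == 0); /* exact fit: "!" still usable */
}
```

In this model the adiw alternative is only picked when the value already sits 
in a suitable register pair, which is the behaviour described above.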

I looked back through the change history, and this has been there since the 
original version.

It could be that it fixes a problem that no longer exists. It will certainly 
produce poor code, as you describe.

It so happens that I noticed this the other day and removed it from my working 
copy (to see if anything bad happened, and also to smoke test my patch for 
BASE_POINTER register spill, since I wanted to force more use of the pointer 
registers).

Nothing bad has happened so far.

I will post results later.

---- Wouter van Gulik <address@hidden> wrote: 
> Wouter van Gulik wrote:
> 
> > 
> > Note that in some cases it could be very interesting to use r27, or Y, 
> > register.
> > 
> 
> Should have written R28 of course.
> 
> Since gcc seems down at the moment I did some more testing.
> 
> Now consider this example:
> extern char *x;      /* pointer variable, loaded with lds below */
> void foo(char *p);
>
> void main(void)
> {
>       char *p = x;
>       foo(p); p+=65;
>       foo(p); p+=65;
>       foo(p); p+=65;
>       foo(p); p+=65;
>       foo(p); p+=65;
>       foo(p); p+=65;
>       foo(p); p+=65;
>       foo(p); p+=65;
>       foo(p); p+=65;
>       foo(p); p+=65;
> }
> This must be done using a subi/sbci pair, since 65 is outside the adiw range.
> 
> But the compiler now seems to realize that p is at a constant offset from x. 
> So we now get:
> 
> main:
> /* prologue: frame size=0 */
>       push r16
>       push r17
> /* prologue end (size=2) */
>       lds r16,x
>       lds r17,(x)+1
>       movw r24,r16
>       call foo
>       movw r24,r16
>       subi r24,lo8(-(65))
>       sbci r25,hi8(-(65))
>       call foo
>       movw r24,r16
>       subi r24,lo8(-(130))
>       sbci r25,hi8(-(130))
> 
> Here x is kept in r16:r17 and the cumulative offset is added to r24:r25.
> 
> But if the compiler can realize this... Then why not do this for adds 
> within the adiw range?!?
> So for p++/p+=1 we would get something like:
> 
>       movw r24, r16
>       adiw r24, 1
>       call foo
>       movw r24, r16
>       adiw r24, 2
> etc.
> 
> This is just as small as the earlier suggested use of R28!
> 
> Wouter
> 
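
For anyone checking the arithmetic in the quoted listing: AVR has no 
add-immediate for single registers, so adding 65 is done by subtracting -(65) 
with a subi/sbci pair, while adiw only takes a 6-bit immediate (0 to 63). A 
minimal C sketch of both points follows; the function names are hypothetical 
and the code only models the arithmetic, it is not compiler output.

```c
#include <assert.h>
#include <stdint.h>

/* ADIW encodes a 6-bit immediate, so only 0..63 fits. */
int fits_adiw(unsigned k)
{
    return k <= 63;
}

/* Models "subi lo,lo8(-(k))" followed by "sbci hi,hi8(-(k))"
   on a 16-bit register pair holding reg. */
uint16_t add_via_subi_sbci(uint16_t reg, uint16_t k)
{
    uint16_t neg = (uint16_t)(0u - k);   /* -(k) in 16-bit two's complement */
    uint8_t lo  = (uint8_t)(reg & 0xFF);
    uint8_t hi  = (uint8_t)(reg >> 8);
    uint8_t klo = (uint8_t)(neg & 0xFF); /* lo8(-(k)) */
    uint8_t khi = (uint8_t)(neg >> 8);   /* hi8(-(k)) */

    uint8_t borrow = (uint8_t)(lo < klo);   /* carry flag left by subi */
    lo = (uint8_t)(lo - klo);               /* subi */
    hi = (uint8_t)(hi - khi - borrow);      /* sbci */

    return (uint16_t)(((uint16_t)hi << 8) | lo);
}

/* Small self-check using the offsets from the listing. */
void check_subi_sbci(void)
{
    assert(add_via_subi_sbci(0x0100, 65) == 0x0100 + 65);
    assert(add_via_subi_sbci(0x00F0, 130) == 0x00F0 + 130);
    assert(fits_adiw(1) && fits_adiw(63) && !fits_adiw(65));
}
```

So lo8(-(65)) is 0xBF and hi8(-(65)) is 0xFF, and the pair behaves as a 16-bit 
add of 65, whereas offsets such as 1 or 2 would fit adiw directly.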




