avr-gcc-list

[avr-gcc-list] Shorter code?


From: Ruud Vlaming
Subject: [avr-gcc-list] Shorter code?
Date: Thu, 12 Jun 2008 15:27:42 +0200
User-agent: KMail/1.9.1

Hi

Normally gcc generates well-optimized code, but
sometimes I wonder how gcc can make simple things
so complicated.

Here is an example:

uint16_t genGetTickCount(void)
{
    return (((uint16_t) uxTickCount.HighByte) << 8) | (uint16_t) (uxTickCount.LowByte);
}

generates

00000768 <genGetTickCount>:
 768: 80 91 1d 01  lds  r24, 0x011D
 76c: 20 91 1c 01  lds  r18, 0x011C
 770: 99 27        eor  r25, r25
 772: 98 2f        mov  r25, r24
 774: 88 27        eor  r24, r24
 776: 33 27        eor  r19, r19
 778: 82 2b        or   r24, r18
 77a: 93 2b        or   r25, r19
 77c: 08 95        ret

whereas it could have been 12 bytes (!) shorter:
00000768 <genGetTickCount>:
 768: 90 91 1d 01  lds  r25, 0x011D
 76c: 80 91 1c 01  lds  r24, 0x011C
 770: 08 95        ret

Is there a way to write the function defined above in C so that the
compiler generates this assembly? Some special combine function maybe?
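
One candidate for such a "combine function" is to assemble the result through a union instead of shift-and-or. The following is only a sketch under assumptions: the `TickCount` layout (low byte first, as the addresses 0x011C/0x011D suggest), the stand-in variable, and the names `Word16` and `genGetTickCountUnion` are hypothetical, and the byte order is taken to be little-endian as on AVR. Whether it actually yields the shorter assembly depends on the compiler version and flags and has not been checked here.

```c
#include <stdint.h>

/* Assumed layout of the tick counter: low byte first (0x011C), then high byte (0x011D). */
typedef struct {
    uint8_t LowByte;
    uint8_t HighByte;
} TickCount;

/* Stand-in for the real variable, which is defined elsewhere in the original project. */
TickCount uxTickCount;

/* Hypothetical helper type: a 16-bit word overlaid on two bytes.
   The lo/hi order assumes a little-endian target such as AVR. */
typedef union {
    uint16_t word;
    struct {
        uint8_t lo;
        uint8_t hi;
    } bytes;
} Word16;

/* The shift-and-or form from the message, kept for comparison. */
uint16_t genGetTickCountShift(void)
{
    return (uint16_t)((((uint16_t) uxTickCount.HighByte) << 8) |
                      (uint16_t) uxTickCount.LowByte);
}

/* Union form: each byte is stored straight into its half of the result. */
uint16_t genGetTickCountUnion(void)
{
    Word16 w;
    w.bytes.lo = uxTickCount.LowByte;
    w.bytes.hi = uxTickCount.HighByte;
    return w.word;
}
```

Both functions return the same value on a little-endian machine. Only the union form depends on byte order, so it is less portable than the shift form.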


Further, I don't know how much intelligence one may expect from the
compiler, but, for example, first clearing r25 and then immediately
overwriting it with r24 seems a real waste of effort. By direct inspection,
thus _without_ any knowledge of what is going on, this code could be
reduced in the following simple steps (ignore the addresses and opcode
bytes, which are copied unchanged from the original listing):

00000768 <genGetTickCount>:
 768: 80 91 1d 01  lds  r24, 0x011D
 76c: 20 91 1c 01  lds  r18, 0x011C
 770: 99 27        eor  r25, r25  // remove this, since it is overwritten directly afterwards
 772: 98 2f        mov  r25, r24
 774: 88 27        eor  r24, r24
 776: 33 27        eor  r19, r19
 778: 82 2b        or   r24, r18
 77a: 93 2b        or   r25, r19  // remove this, since "or" with zero does not change r25
 77c: 08 95        ret

00000768 <genGetTickCount>:
 768: 80 91 1d 01  lds  r24, 0x011D
 76c: 20 91 1c 01  lds  r18, 0x011C
 772: 98 2f        mov  r25, r24
 774: 88 27        eor  r24, r24
 776: 33 27        eor  r19, r19  // remove this, since r19 is now unused
 778: 82 2b        or   r24, r18  // change into "mov", since r24 is zero
 77c: 08 95        ret

00000768 <genGetTickCount>:
 768: 80 91 1d 01  lds  r24, 0x011D  // load directly into r25 instead, since r24 is overwritten after the move
 76c: 20 91 1c 01  lds  r18, 0x011C
 772: 98 2f        mov  r25, r24     // remove this, since r25 is now loaded directly
 774: 88 27        eor  r24, r24     // remove this, since r24 is overwritten directly afterwards
 778: 82 2b        mov  r24, r18
 77c: 08 95        ret

00000768 <genGetTickCount>:
 768: 80 91 1d 01  lds  r25, 0x011D
 76c: 20 91 1c 01  lds  r18, 0x011C  // load directly into r24 instead, since r18 is unused after the move
 778: 82 2b        mov  r24, r18     // remove this, since r24 is now loaded directly
 77c: 08 95        ret

00000768 <genGetTickCount>:
 768: 80 91 1d 01  lds  r25, 0x011D 
 76c: 20 91 1c 01  lds  r24, 0x011C 
 77c: 08 95        ret
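
The claim that the first and last listings are equivalent can be checked mechanically. The sketch below models only the four registers that appear in the listings and runs both instruction sequences for every pair of input bytes. The `Regs` type and all function names are hypothetical, and the two `lds` loads are modeled as plain parameters rather than real memory reads.

```c
#include <stdint.h>

/* Hypothetical model of the registers touched by the listings. */
typedef struct {
    uint8_t r18, r19, r24, r25;
} Regs;

/* Original nine-instruction sequence.
   hi stands for the byte at 0x011D, lo for the byte at 0x011C. */
uint16_t run_original(uint8_t hi, uint8_t lo)
{
    Regs r = { 0xAA, 0xAA, 0xAA, 0xAA };   /* arbitrary junk on entry */
    r.r24 = hi;                            /* lds r24, 0x011D */
    r.r18 = lo;                            /* lds r18, 0x011C */
    r.r25 = (uint8_t)(r.r25 ^ r.r25);      /* eor r25, r25    */
    r.r25 = r.r24;                         /* mov r25, r24    */
    r.r24 = (uint8_t)(r.r24 ^ r.r24);      /* eor r24, r24    */
    r.r19 = (uint8_t)(r.r19 ^ r.r19);      /* eor r19, r19    */
    r.r24 = (uint8_t)(r.r24 | r.r18);      /* or  r24, r18    */
    r.r25 = (uint8_t)(r.r25 | r.r19);      /* or  r25, r19    */
    return (uint16_t)((r.r25 << 8) | r.r24); /* ret, result in r25:r24 */
}

/* Reduced three-instruction sequence from the final listing. */
uint16_t run_reduced(uint8_t hi, uint8_t lo)
{
    Regs r = { 0xAA, 0xAA, 0xAA, 0xAA };   /* arbitrary junk on entry */
    r.r25 = hi;                            /* lds r25, 0x011D */
    r.r24 = lo;                            /* lds r24, 0x011C */
    return (uint16_t)((r.r25 << 8) | r.r24); /* ret, result in r25:r24 */
}

/* Returns 1 if both sequences return the same value for all 65536 inputs. */
int sequences_agree(void)
{
    unsigned hi, lo;
    for (hi = 0; hi < 256; hi++) {
        for (lo = 0; lo < 256; lo++) {
            if (run_original((uint8_t)hi, (uint8_t)lo) !=
                run_reduced((uint8_t)hi, (uint8_t)lo)) {
                return 0;
            }
        }
    }
    return 1;
}
```

The comparison covers only the return value in r25:r24. The two sequences leave r18 and r19 in different states, which should be harmless because avr-gcc treats both as call-clobbered registers.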

Could such post-compilation optimization steps be integrated into the compiler?

I would like to hear your comments.

Ruud.




