[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
RE: [avr-gcc-list] Missed optimization or am *I* missing something?
From: |
Weddington, Eric |
Subject: |
RE: [avr-gcc-list] Missed optimization or am *I* missing something? |
Date: |
Thu, 23 Sep 2010 07:33:20 -0600 |
Hi Johannes,
My best guess is that this is a missed optimization. The compiler is probably
selecting different patterns based on the different code. It recognizes the
redundant adding of zero in the operations below when WEIRD is 0 and rightly
throws them out. When WEIRD is 1, those operations have to be kept in, and so
different patterns are selected, which are either more optimal, or triggers
different kinds of optimizations to take place afterwards, hence producing
better code even though there are more operations. I noticed that the longer
code generates a lot more code with the pointer registers (Z and X in this
case).
I noticed too that the more optimal case below is still not the best code: it
looks like the index variable 'i' in your for loop is being treated as a 16-bit
value, when you explicitly declare it as an 8-bit unsigned char. Fixing that
could save another 3 instructions.
Abnikant or Vidya (on CC), could one you please confirm this missed
optimization on the latest GCC trunk/4.6 and if so, fill out an internal bug
report for this issue? Thanks.
Eric Weddington
> -----Original Message-----
> From:
> address@hidden
> [mailto:address@hidden
> org] On Behalf Of Johannes Bauer
> Sent: Thursday, September 23, 2010 5:05 AM
> To: address@hidden
> Subject: [avr-gcc-list] Missed optimization or am *I* missing
> something?
>
> Hello list,
>
> I've encountered a strange problem of which I do not know
> exactly wether
> I cause it or if avr-gcc misses an optimization - this is why
> I refer to
> you for help. This is a full example problem demonstrating the issue:
>
> #include <stdint.h>
> #include <avr/io.h>
>
> //#define WEIRD 0
> #define FOOBUFSIZE 8
> uint16_t fooBoundedBuffer[FOOBUFSIZE];
> uint16_t FOO(void) {
> uint16_t boundedBufferValueSum;
> unsigned char offset;
> unsigned char i;
> offset = DMA.CH0.TRFCNT - WEIRD;
> boundedBufferValueSum = 0;
> for (i = 0; i < 4; i++) {
> boundedBufferValueSum +=
> fooBoundedBuffer[(offset + i + WEIRD) %
> FOOBUFSIZE];
> }
> return boundedBufferValueSum;
> }
>
> int main() {
> return 0;
> }
>
> if this is compiled with
>
> avr-gcc -mmcu=atxmega128a1 -Os -DWEIRD=0 -o foo example.c &&
> avr-nm foo
> -S | grep FOO
>
> it shows
>
> 00000244 0000006e T FOO
>
> i.e. FOO is 110 bytes long. Compile with
>
> avr-gcc -mmcu=atxmega128a1 -Os -DWEIRD=1 -o foo example.c &&
> avr-nm foo
> -S | grep FOO
>
> and it shows
>
> 00000244 00000042 T FOO
>
> i.e. FOO is only 66 bytes long (44 bytes shorter). The resulting
> assembly in the short case is
>
> 00000244 <FOO>:
> 244: 80 91 14 01 lds r24, 0x0114
> 248: 90 91 15 01 lds r25, 0x0115
> 24c: 81 50 subi r24, 0x01 ; 1
> 24e: 40 e0 ldi r20, 0x00 ; 0
> 250: 50 e0 ldi r21, 0x00 ; 0
> 252: 20 e0 ldi r18, 0x00 ; 0
> 254: 30 e0 ldi r19, 0x00 ; 0
> 256: 90 e0 ldi r25, 0x00 ; 0
> 258: bc 01 movw r22, r24
> 25a: 6f 5f subi r22, 0xFF ; 255
> 25c: 7f 4f sbci r23, 0xFF ; 255
> 25e: fb 01 movw r30, r22
> 260: e2 0f add r30, r18
> 262: f3 1f adc r31, r19
> 264: e7 70 andi r30, 0x07 ; 7
> 266: f0 70 andi r31, 0x00 ; 0
> 268: ee 0f add r30, r30
> 26a: ff 1f adc r31, r31
> 26c: e0 50 subi r30, 0x00 ; 0
> 26e: f0 4e sbci r31, 0xE0 ; 224
> 270: 80 81 ld r24, Z
> 272: 91 81 ldd r25, Z+1 ; 0x01
> 274: 48 0f add r20, r24
> 276: 59 1f adc r21, r25
> 278: 2f 5f subi r18, 0xFF ; 255
> 27a: 3f 4f sbci r19, 0xFF ; 255
> 27c: 24 30 cpi r18, 0x04 ; 4
> 27e: 31 05 cpc r19, r1
> 280: 71 f7 brne .-36 ; 0x25e
> <FOO+0x1a>
> 282: ca 01 movw r24, r20
> 284: 08 95 ret
>
> and in the long case
>
> 00000244 <FOO>:
> 244: e0 e0 ldi r30, 0x00 ; 0
> 246: f1 e0 ldi r31, 0x01 ; 1
> 248: 44 89 ldd r20, Z+20 ; 0x14
> 24a: 55 89 ldd r21, Z+21 ; 0x15
> 24c: 50 e0 ldi r21, 0x00 ; 0
> 24e: da 01 movw r26, r20
> 250: 12 96 adiw r26, 0x02 ; 2
> 252: a7 70 andi r26, 0x07 ; 7
> 254: b0 70 andi r27, 0x00 ; 0
> 256: 60 e0 ldi r22, 0x00 ; 0
> 258: 70 e2 ldi r23, 0x20 ; 32
> 25a: aa 0f add r26, r26
> 25c: bb 1f adc r27, r27
> 25e: a6 0f add r26, r22
> 260: b7 1f adc r27, r23
> 262: fa 01 movw r30, r20
> 264: 31 96 adiw r30, 0x01 ; 1
> 266: e7 70 andi r30, 0x07 ; 7
> 268: f0 70 andi r31, 0x00 ; 0
> 26a: ee 0f add r30, r30
> 26c: ff 1f adc r31, r31
> 26e: e6 0f add r30, r22
> 270: f7 1f adc r31, r23
> 272: 2d 91 ld r18, X+
> 274: 3c 91 ld r19, X
> 276: 80 81 ld r24, Z
> 278: 91 81 ldd r25, Z+1 ; 0x01
> 27a: 28 0f add r18, r24
> 27c: 39 1f adc r19, r25
> 27e: fa 01 movw r30, r20
> 280: e7 70 andi r30, 0x07 ; 7
> 282: f0 70 andi r31, 0x00 ; 0
> 284: ee 0f add r30, r30
> 286: ff 1f adc r31, r31
> 288: e6 0f add r30, r22
> 28a: f7 1f adc r31, r23
> 28c: 80 81 ld r24, Z
> 28e: 91 81 ldd r25, Z+1 ; 0x01
> 290: 28 0f add r18, r24
> 292: 39 1f adc r19, r25
> 294: 4d 5f subi r20, 0xFD ; 253
> 296: 5f 4f sbci r21, 0xFF ; 255
> 298: 47 70 andi r20, 0x07 ; 7
> 29a: 50 70 andi r21, 0x00 ; 0
> 29c: 44 0f add r20, r20
> 29e: 55 1f adc r21, r21
> 2a0: 64 0f add r22, r20
> 2a2: 75 1f adc r23, r21
> 2a4: fb 01 movw r30, r22
> 2a6: 80 81 ld r24, Z
> 2a8: 91 81 ldd r25, Z+1 ; 0x01
> 2aa: 28 0f add r18, r24
> 2ac: 39 1f adc r19, r25
> 2ae: c9 01 movw r24, r18
> 2b0: 08 95 ret
>
> The used compiler is 4.3.4 with the CVS patches from the BSD
> repository.
>
> Can this be explained somehow?
>
> Kind regards,
> Johannes
>
> _______________________________________________
> AVR-GCC-list mailing list
> address@hidden
> http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
>