Re: [avr-libc-dev] -O3? -Os?


From: E. Weddington
Subject: Re: [avr-libc-dev] -O3? -Os?
Date: Mon, 16 Dec 2002 09:18:32 -0700

On 16 Dec 2002 at 16:06, Joerg Wunsch wrote:

> I've always been curious what we actually gain by using -O3 for the
> `larger' AVR devices when compiling the library.  So I finally wrote a
> test case and ran it on an ATmega128.  To create a test job that would
> benefit as much as possible from any speed improvement made inside
> avr-libc, I decided that sorting strings would serve quite well: it
> calls library functions that are optimizable and that take a fair
> amount of CPU time to execute (qsort()).  I used qsort() to sort an
> array of strings (boldly borrowed from the first lines of the famous
> "Bastard Operator from Hell" :), once using the normal strcmp()
> function, and once using a function that effectively sorts the array
> by string length.
>
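The test program itself is not included in the message. As a minimal sketch of what the two comparison callbacks for qsort() might look like, assuming the array holds `const char *` elements: the names `cmp_strcmp`, `cmp_by_length` and `sort_strings` are hypothetical, and `cmp_by_length` is only a guess at what the timing output calls "strlencmp".

```c
#include <stdlib.h>
#include <string.h>

/* Compare two array elements (each a `const char *`) with strcmp(). */
static int cmp_strcmp(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strcmp(*sa, *sb);
}

/* Order strings by length (assumed behaviour of "strlencmp"). */
static int cmp_by_length(const void *a, const void *b)
{
    size_t la = strlen(*(const char *const *)a);
    size_t lb = strlen(*(const char *const *)b);
    return (la > lb) - (la < lb);   /* -1, 0 or 1, no overflow */
}

/* Sort `n` string pointers in place with the given comparator. */
static void sort_strings(const char **strings, size_t n,
                         int (*cmp)(const void *, const void *))
{
    qsort(strings, n, sizeof strings[0], cmp);
}
```

Calling `sort_strings(lines, count, cmp_strcmp)` and then `sort_strings(lines, count, cmp_by_length)` would exercise both library paths; the length comparator spends its time in strlen() rather than strcmp().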
> The resulting object file has been linked against a current avr-libc,
> where the library was configured and compiled with different
> optimization options (avrlib_opt_speed in configure.in).
>
> Here are the results:
>
> -O3:
>
> % avr-size test.out
>    text    data     bss     dec     hex filename
>    6898    1980      10    8888    22b8 test.out
>
> time for qsort(strcmp): 0.000903 seconds.
> time for qsort(strlencmp): 0.019705 seconds.
> done.
>
> -mcall-prologues -Os:
>
> % avr-size test.out
>    text    data     bss     dec     hex filename
>    6474    1980      10    8464    2110 test.out
>
> time for qsort(strcmp): 0.000972 seconds.
> time for qsort(strlencmp): 0.020069 seconds.
> done.
>
> -Os:
>
> % avr-size test.out
>    text    data     bss     dec     hex filename
>    6618    1980      10    8608    21a0 test.out
>
> time for qsort(strcmp): 0.000955 seconds.
> time for qsort(strlencmp): 0.020069 seconds.
> done.
>
> -O2:
>
> % avr-size test.out
>    text    data     bss     dec     hex filename
>    6666    1980      10    8656    21d0 test.out
>
> time for qsort(strcmp): 0.000972 seconds.
> time for qsort(strlencmp): 0.020069 seconds.
>
>
> It's interesting to note that none of the flag combinations other than
> -O3 gains anything measurable in terms of speed, while
> -mcall-prologues -Os (our default for the `small' AVR devices)
> yields the smallest code size.  (The difference between 955 µs and
> 972 µs is just a single timer tick, so take that with a grain of
> salt.)
>
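The single-tick remark can be checked by converting the reported times to tick counts. This is a sketch only: the times quoted above happen to be consistent with whole multiples of 1/57600 s (about 17.4 µs), but that tick rate is inferred from the numbers and is not stated in the message. `ASSUMED_TICK_HZ` and both helper names are hypothetical.

```c
/* Assumed tick rate, inferred from the reported times (not from the message). */
#define ASSUMED_TICK_HZ 57600.0

/* Convert a reported duration in seconds to the nearest whole tick count. */
static long seconds_to_ticks(double seconds)
{
    return (long)(seconds * ASSUMED_TICK_HZ + 0.5);
}

/* Convert a tick count back to microseconds. */
static double ticks_to_microseconds(long ticks)
{
    return (double)ticks * 1e6 / ASSUMED_TICK_HZ;
}
```

Under that assumption, 0.000955 s and 0.000972 s map to 55 and 56 ticks, so the two measurements differ by exactly one tick.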
> For -O3, the code size is about 6 % larger than with
> -mcall-prologues -Os (the bloat is even greater if you consider that
> vfprintf() & Co. take up about 25 % of the text segment and are
> unaffected by the global -O settings, since they use private,
> hand-crafted optimization flags).  The speed gain is between 2 and
> 6 %.
>
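The size and speed percentages can be re-derived from the avr-size and timing figures quoted above. A small sketch of the arithmetic follows; the helper names `percent_increase` and `percent_saved` are hypothetical, and which runs to compare against is a choice made here, not one stated in the message.

```c
/* Percentage by which `value` exceeds `baseline`. */
static double percent_increase(double value, double baseline)
{
    return (value - baseline) / baseline * 100.0;
}

/* Percentage of run time saved when going from `slow` to `fast`. */
static double percent_saved(double fast, double slow)
{
    return (slow - fast) / slow * 100.0;
}
```

With the text sizes above, `percent_increase(6898, 6474)` comes to roughly 6.5 (-O3 against -mcall-prologues -Os). For the timings, `percent_saved(0.019705, 0.020069)` is roughly 1.8 for the length-based sort, and `percent_saved(0.000903, 0.000955)` is roughly 5.4 for the strcmp() sort against plain -Os (roughly 7.1 against the 0.000972 s runs).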
>
> My vote would be to use -mcall-prologues -Os for all of our targets.
>

Would you want to write up a new FAQ entry about this?
Eric



