Re: [avr-libc-dev] Pow Function in avr8
From: Georg-Johann Lay
Subject: Re: [avr-libc-dev] Pow Function in avr8
Date: Fri, 30 Nov 2012 00:06:24 +0100
User-agent: Thunderbird 2.0.0.24 (Windows/20100228)
Thomas, George wrote:
If you know that the exponent is integral you may want to have a look at
GCC's __builtin_powi* functions.
http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
You asked for a base of 2, maybe ldexp() fits your use case.
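A minimal sketch of both suggestions, assuming a GCC-compatible compiler. `__builtin_powif` and `ldexp` are the real names; the wrappers `cube` and `pow2` are made up here purely for illustration.

```c
#include <math.h>

/* x^3 with an exponent that is known to be integral.  Depending on the
   options, GCC may expand this inline or call __powisf2 from libgcc. */
float cube (float x)
{
    return __builtin_powif (x, 3);
}

/* 2^n for integral n: ldexp (x, n) computes x * 2^n by adjusting the
   binary exponent, so neither log nor exp is involved. */
double pow2 (int n)
{
    return ldexp (1.0, n);
}
```

Note that GCC documents no precision or rounding guarantees for `__builtin_powi*`, unlike for `pow`.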
Handling integral exponents separately will increase the code size
because the exponent must be checked at run time and extra code must be
executed in that case.
[...]
The builtin called __powisf2 in libgcc.
The code size and cycles obtained in avrstudio6 were as follows.
Function     Size   Cycles
pow           152     5525
__powisf2     210      452
The sizes are misleading. They just take into account the raw functions
but not the sizes of the dependencies like exp, log, division, prologue
helper, etc.
Also, the code in libc already seems to check whether the exponent is an
integer, so would calling the libgcc function be advisable?
I am not sure about that.
Most applications need a small code size, but with each special handling
that is needed at run time you will drag more dependency functions from
the libraries.
And I still wonder if such an optimization is "important":
If the user knows that the exponent is integral he can use
__builtin_powi*() in the first place.
If, on the other hand, we don't know anything about the exponent then a
reasonable assumption is that it is very unlikely to hit an integral
exponent, thus the expected speed gain will be really small because it
is very unlikely that the input is an element of some null set...
Let's have a look at the raw sizes.
__powisf2 is open coded in C in libgcc2.c [1], basically
    float
    __powisf2 (float x, int m)
    {
      unsigned int n = m < 0 ? -m : m;
      float y = n % 2 ? x : 1;

      while (n >>= 1)
        {
          x = x * x;
          if (n % 2)
            y = y * x;
        }
      return m < 0 ? 1/y : y;
    }
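To get a feeling for how little work this loop does, here is an instrumented copy that counts the float multiplications. This is only a sketch: the name `powi_counted` and the out-parameter `muls` are made up, and the negation is done in unsigned arithmetic so that the copy is well defined for every `int`.

```c
/* Instrumented copy of the loop above: returns x^m and stores the number
   of float multiplications in *muls (a hypothetical out-parameter). */
float powi_counted (float x, int m, unsigned *muls)
{
    unsigned int n = m < 0 ? -(unsigned int) m : (unsigned int) m;
    float y = n % 2 ? x : 1;

    *muls = 0;
    while (n >>= 1)
    {
        x = x * x;          /* one squaring per remaining exponent bit */
        ++*muls;
        if (n % 2)
        {
            y = y * x;      /* one more multiply for each bit that is set */
            ++*muls;
        }
    }
    return m < 0 ? 1 / y : y;
}
```

For m = 13 this performs 5 multiplications; in general there are at most two per bit of the exponent, plus one division if m is negative.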
Compiling with 4.6.2 and -mcall-prologues -Os gives the size you
mentioned above. With 4.7.2 and also -fno-split-wide-types we see a
size of 138, which is about a third less code. Presumably we see
PR52278 [2] in action.
This means there is much room for improvement; an assembler programmer
will easily reduce the size below 100 without the need of prologue /
epilogue helpers.
log (and thus pow) uses a power series expansion which does not
converge very well and has a small radius of convergence of 1, namely
the Mercator series [3].
There are other representations of log that might yield better results,
like the inverse hyperbolic tangent (artanh) or cubic splines.
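To illustrate the difference in convergence, here is a small sketch that compares partial sums of the two series. It is not how avr-libc implements log and it omits any range reduction; the function names are made up. The artanh form uses the identity ln(x) = 2 artanh((x - 1) / (x + 1)).

```c
/* Partial sum of the Mercator series
   ln(1 + t) = t - t^2/2 + t^3/3 - ...,  valid for -1 < t <= 1. */
double log_mercator (double x, int terms)
{
    double t = x - 1.0, p = t, sum = 0.0;

    for (int k = 1; k <= terms; k++)
    {
        sum += (k % 2 ? p : -p) / k;   /* alternating sign, divide by k */
        p *= t;
    }
    return sum;
}

/* Partial sum of ln(x) = 2 (z + z^3/3 + z^5/5 + ...)
   with z = (x - 1) / (x + 1), valid for all x > 0. */
double log_artanh (double x, int terms)
{
    double z = (x - 1.0) / (x + 1.0), z2 = z * z, p = z, sum = 0.0;

    for (int k = 0; k < terms; k++)
    {
        sum += p / (2 * k + 1);        /* odd powers of z only */
        p *= z2;
    }
    return 2.0 * sum;
}
```

At x = 2, which sits on the edge of the Mercator series' range, ten Mercator terms are still off by roughly 0.05, whereas ten artanh terms (z = 1/3) agree with ln 2 to roughly 1e-11.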
Maybe someone wants to go through the hassle to implement a better
version of pow and do all the implementing and (regression) testing and
benchmarking and support and whatever again to speed up the stuff or
gain 2 or 3 LSBs...
Johann
--
[1] http://gcc.gnu.org/viewcvs/trunk/libgcc/libgcc2.c?revision=184997&view=markup#l1744
[2] http://gcc.gnu.org/PR52278
[3] http://en.wikipedia.org/wiki/Mercator_series