Re: [Bug-gsl] gsl_sf_bessel_Jn_e performance

From: Jonny Taylor
Subject: Re: [Bug-gsl] gsl_sf_bessel_Jn_e performance
Date: Thu, 12 Jun 2008 19:14:25 +0100

>> I don't know what the policy is regarding optimizations which
>> (slightly) decrease code readability, but there's a simple change to
>> the downward recurrence in gsl_sf_bessel_Jn_e which doubles the speed
>> of the function (almost all the remaining time is spent in
>> gsl_sf_bessel_J_CF1) for n of order 10-40.
>
> Thanks for your email.  That is interesting.  Can you give a few more
> details about the compilation options you used, compiler version and
> the platform.

My gsl install compiles with:
gcc -DHAVE_CONFIG_H -I. -I.. -I.. -g -O2 -c bessel_Jn.c -o bessel_Jn.o
which are the out-of-the-box options.

Issue observed on PPC G5, OS X, gcc 4.0.1, and also on x86_64, Ubuntu, gcc 4.1.3.

> Did you see how much of the benefit comes from replacing 2/x by a
> constant compared with keeping the value of k in a double?  The
> optimisation of replacing (2/x) by a constant would be something I
> would expect GCC to deduce at some level.

Further investigation reveals that gcc CAN optimize 2/x, but only if strict IEEE compliance is disabled [-ffast-math], which is not the out-of-the-box option for gsl compilation.

With -O2 and without -ffast-math there is roughly a 30% improvement for either of the optimizations on its own, and a 40% improvement if they are both made together (I speculate the lack of further improvement is because loop overhead is now the bottleneck). OK, not quite doubling with the parameters I used for this test, but still not to be sneezed at.

With -O3 -ffast-math there is actually still a slight improvement if I pull out 2/x (haven't looked at why exactly). Roughly 30% improvement obtained from the shadow variable optimization.
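
For readers who want to see the two changes side by side, here is a minimal sketch of a downward three-term recurrence J_{k-1} = (2k/x) J_k - J_{k+1}, written both ways. It is not the code from bessel_Jn.c: the function names, the seed values and the self-check are all assumptions made up for illustration. The first version keeps k as an int and divides by x on every pass; the second hoists 2/x out of the loop and carries k in a double "shadow" variable so no int-to-double conversion happens inside the loop. Because (2.0*k)/x and k*(2.0/x) can round differently, the two versions may differ in the last few bits, which is presumably why the compiler will not make the substitution itself under strict IEEE rules.

```c
#include <assert.h>

/* Baseline (hypothetical): integer counter, division by x on every pass. */
double recur_down_baseline(int n, double x, double jk, double jkp1)
{
    int k;
    for (k = n; k > 0; k--) {
        double jkm1 = 2.0 * k / x * jk - jkp1;  /* int->double and a divide */
        jkp1 = jk;
        jk = jkm1;
    }
    return jk;  /* value at order 0, up to the normalisation of the seeds */
}

/* Variant (hypothetical): 2/x hoisted, k shadowed by a double. */
double recur_down_hoisted(int n, double x, double jk, double jkp1)
{
    const double two_over_x = 2.0 / x;  /* computed once, outside the loop */
    double kd = (double) n;             /* shadow copy of the counter k */
    int k;
    for (k = n; k > 0; k--) {
        double jkm1 = kd * two_over_x * jk - jkp1;
        jkp1 = jk;
        jk = jkm1;
        kd -= 1.0;                      /* exact for integers of this size */
    }
    return jk;
}

/* Relative difference between the two variants for arbitrary seeds
   jk = 1, jkp1 = 0 (assumes the baseline result is non-zero). */
double variant_rel_diff(int n, double x)
{
    double a = recur_down_baseline(n, x, 1.0, 0.0);
    double b = recur_down_hoisted(n, x, 1.0, 0.0);
    double d = a - b;
    if (d < 0.0) d = -d;
    if (a < 0.0) a = -a;
    return d / a;
}

/* Tiny self-check: the variants should agree to rounding error. */
void self_check(void)
{
    assert(variant_rel_diff(30, 5.0) < 1e-10);
}
```

Calling self_check() from a test program confirms that the two loops agree to within rounding for n = 30, x = 5. Timing them is left to the reader, since the gain depends on compiler, flags and platform as described above.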

Hope this helps
