From: Jonny Taylor
Subject: Re: [Bug-gsl] gsl_sf_bessel_Jn_e performance
Date: Thu, 12 Jun 2008 19:14:25 +0100

>> I don't know what the policy is regarding optimizations which (slightly) decrease code readability, but there's a simple change to the downward recurrence in gsl_sf_bessel_Jn_e which doubles the speed of the function (almost all the remaining time is spent in gsl_sf_bessel_J_CF1) for n of order 10-40.

> Thanks for your email. That is interesting. Can you give a few more details about the compilation options you used, compiler version and the platform?

My gsl install compiles with:

gcc -DHAVE_CONFIG_H -I. -I.. -I.. -g -O2 -c bessel_Jn.c -o bessel_Jn.o

which are the out-of-the-box options. Issue observed on PPC G5, OS X, gcc 4.0.1, and also on x86_64, Ubuntu, gcc 4.1.3.

Further investigation reveals that gcc CAN optimize 2/x, but only if strict IEEE compliance is disabled [-ffast-math], which is not the out-of-the-box option for gsl compilation.

> Did you see how much of the benefit comes from replacing 2/x by a constant compared with keeping the value of k in a double? The optimisation of replacing (2/x) by a constant would be something I would expect GCC to deduce at some level.

With -O2 and without -ffast-math there is roughly a 30% improvement from either optimization on its own, and a 40% improvement if both are made together (I speculate the lack of further improvement is because loop overhead is now the bottleneck). OK, not quite doubling with the parameters I used for this test, but still not to be sneezed at.

With -O3 -ffast-math there is actually still a slight improvement if I pull out 2/x by hand (I haven't looked at why exactly). The shadow variable optimization (keeping the value of k in a double) still gives roughly a 30% improvement.

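For anyone who wants to see what the two changes amount to, here is a rough sketch. It is not the GSL source: the function names and the way the starting values are passed in are made up for illustration. It only assumes the loop has the usual downward-recurrence shape J_{k-1} = (2k/x) J_k - J_{k+1}. The first version divides by x and converts the integer k to a double on every pass; the second hoists 2/x out of the loop and carries k in a double alongside the integer counter.

```c
#include <assert.h>

/* Hypothetical names: a sketch of the loop shape, not the GSL source.
 * Both functions run the downward recurrence
 *     J_{k-1} = (2k/x) * J_k - J_{k+1}
 * from order n down to order 0, starting from caller-supplied values,
 * and return the resulting (unnormalised) order-0 value. */

/* Baseline: one division and one int-to-double conversion per pass. */
double recur_down_baseline(int n, double x, double Jkp1, double Jk)
{
  int k;
  assert(x != 0.0);
  for (k = n; k > 0; k--) {
    double Jkm1 = 2.0 * k / x * Jk - Jkp1;
    Jkp1 = Jk;
    Jk = Jkm1;
  }
  return Jk;
}

/* Both changes applied: 2/x is computed once, and k is shadowed by
 * the double kd so no conversion is needed inside the loop. */
double recur_down_optimized(int n, double x, double Jkp1, double Jk)
{
  const double two_over_x = 2.0 / x; /* hoisted out of the loop */
  double kd = (double) n;            /* shadow copy of k */
  int k;
  assert(x != 0.0);
  for (k = n; k > 0; k--) {
    double Jkm1 = kd * two_over_x * Jk - Jkp1;
    Jkp1 = Jk;
    Jk = Jkm1;
    kd -= 1.0;
  }
  return Jk;
}
```

The two versions are not guaranteed to agree bit for bit, since (2k)/x and k*(2/x) can round differently. That is presumably why gcc will not make the 2/x substitution itself unless -ffast-math relaxes strict IEEE behaviour.
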
Hope this helps
Jonny