Re: [pooma-dev] Temporary copies do appear...??
From: Richard Guenther
Subject: Re: [pooma-dev] Temporary copies do appear...??
Date: Fri, 21 May 2004 10:31:40 +0200
User-agent: Mozilla Thunderbird 0.5 (X11/20040313)
Radek Pecher wrote:
> Basically, simple algebraic expressions based on the tiny Vector class
> do create temporary Full-engine copies of individual subexpressions,
> as opposed to what POOMA claims to prevent. The following short main
> code:
>
>   #include "Pooma/Arrays.h"
>   int main(int argc, char* argv[])
>   {
>     Pooma::initialize(argc, argv);
>     Vector<2> v1(1, 2), v2;
>     v2 = v1*v1 + v1*v1;
>     Pooma::finalize();
>     return 0;
>   }
You are right that gcc 3.3 does not optimize away the copy calls. But
compiling the above with g++-3.4 -O2 -fpeel-loops results in
straight-line code. With the Intel 8.0 compiler, the asm code is a bit
obfuscated and calls to destructors are left in (failing to inline
these seems to be a common problem with the Intel compiler).
I don't know whether one can structurally avoid the extra constructor
calls inside the Vector code, but maybe you can have a look at it? This
is certainly a point where optimization would be useful (if not for
compilation speed).
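
For illustration only, here is a minimal sketch of the expression-template
idea behind this discussion. It is not POOMA's Vector implementation: every
name in it (Vec2, Expr, Binary, g_copies, copies_for_sample_expression) is
hypothetical. It assumes a fixed size-2 vector of doubles and builds
v1*v1 + v1*v1 as a tree of lightweight nodes that hold references, so no
vector is copied before the final assignment loop runs.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch; none of these names come from POOMA.
static int g_copies = 0;  // counts Vec2 copy constructions

// CRTP base so operators can accept any expression type.
template <class E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

struct Vec2 : Expr<Vec2> {
    double d[2];
    Vec2(double a = 0, double b = 0) : d{a, b} {}
    Vec2(const Vec2& o) : d{o.d[0], o.d[1]} { ++g_copies; }
    Vec2& operator=(const Vec2& o) { d[0] = o.d[0]; d[1] = o.d[1]; return *this; }

    // Assigning an expression evaluates it element by element.
    template <class E>
    Vec2& operator=(const Expr<E>& e) {
        const E& x = e.self();
        for (std::size_t i = 0; i < 2; ++i) d[i] = x[i];
        return *this;
    }
    double operator[](std::size_t i) const { return d[i]; }
};

struct Add { static double apply(double a, double b) { return a + b; } };
struct Mul { static double apply(double a, double b) { return a * b; } };

// Expression node: holds references to its operands, never copies them.
template <class L, class R, class Op>
struct Binary : Expr<Binary<L, R, Op> > {
    const L& l;
    const R& r;
    Binary(const L& l_, const R& r_) : l(l_), r(r_) {}
    double operator[](std::size_t i) const { return Op::apply(l[i], r[i]); }
};

template <class L, class R>
Binary<L, R, Mul> operator*(const Expr<L>& a, const Expr<R>& b) {
    return Binary<L, R, Mul>(a.self(), b.self());
}

template <class L, class R>
Binary<L, R, Add> operator+(const Expr<L>& a, const Expr<R>& b) {
    return Binary<L, R, Add>(a.self(), b.self());
}

// Evaluates v1*v1 + v1*v1 into `out`; returns the number of Vec2 copies made.
int copies_for_sample_expression(Vec2& out) {
    Vec2 v1(1, 2);
    g_copies = 0;
    out = v1 * v1 + v1 * v1;  // nodes live until the end of this statement
    return g_copies;
}
```

In this sketch, copies_for_sample_expression leaves out holding (2, 8) and
returns 0, because the nodes only reference v1. Whether the remaining node
constructors and operator[] calls collapse into straight-line code still
depends on the compiler's inlining, which is the point made above.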
> g++ -ftemplate-depth-60 -Drestrict=__restrict__ -fno-exceptions \
>   -DNOPAssert -DNOCTAssert -O2 -fno-default-inline -funroll-loops \
>   -fstrict-aliasing -o Main Main.cpp \
>   -I$HOME/lib/Optim/POOMA/linux/lib/PoomaConfiguration-gcc \
>   -I$HOME/lib/Optim/POOMA/linux/src -I$HOME/lib/Optim/POOMA/linux/lib \
>   -fno-exceptions -L$HOME/lib/Optim/POOMA/linux/lib -lpooma-gcc -lm
Also, if you are using gcc, you may want to apply the leafify patch,
available at
http://www.tat.physik.uni-tuebingen.de/~rguenth/gcc/
to your gcc distribution and make the POOMA evaluators use it (I can
provide you with a patch). That is worth about a 50% performance increase.
Hope that helps,
Richard.