
Re: [Freepooma-devel] [PATCH] PathScale EKOPath compiler support


From: Roman Krylov
Subject: Re: [Freepooma-devel] [PATCH] PathScale EKOPath compiler support
Date: Thu, 17 Mar 2005 16:50:15 +0300
User-agent: Mozilla Thunderbird 1.0 (X11/20041206)

Hi.
By the way, here is what I get:

./Loop18  --no-diags
             C
N          restrict         C          CppTran       PoomaII
100        374.37        372.21        487.60         76.82
215        545.30        493.62        715.59        105.84
464        580.30        579.18        629.11        290.75
1000       565.24        577.71        626.62        365.99
2154       586.90        603.90        592.03        411.40
4641       559.57        602.78        621.63        421.76
10000      204.52        209.03        176.68        158.57
21544       79.59         80.57         78.66         78.08
46415       80.35         79.98         77.39         78.65
100000      77.81         78.33         79.65         75.44

And what are these numbers? I haven't found the answer in the reference documentation for the Benchmark class.
Roman.

On Wed, 16 Mar 2005, Bryan O'Sullivan wrote:

> On Wed, 2005-03-16 at 09:22 +0100, Richard Guenther wrote:
>
> > Thanks, I added this to the HEAD and the r2 branch.  I'm curious,
> > do you have any numbers for the benchmarks like ABCTest/BlitzLoops and
> > Doof?  I don't have pathscale compilers available here, and we don't have
> > amd64 systems here anyway.
>
> I'm starting to run the benchmarks at the moment.  I've never seen any
> benchmark numbers for other compilers or systems, though, so I wouldn't
> know which numbers are good or bad.  Can you point me at some?  The
> more, the better, especially OpenMP and MPI.

Well, all the shipped benchmarks produce output like this
(BlitzLoops):

BlitzLoops> ./LINUXgcc/Loop18 --no-diags
             C
N          restrict         C          CppTran       PoomaII
100       1461.20       1428.96        982.61        253.63
215       1609.77       1609.77       1300.28        432.08
464       1779.71       1544.34       1396.56        557.08
1000      1576.69       1576.69       1105.65        715.73
2154      1656.58       1570.21       1111.69        819.32
4641      1549.80       1494.28        965.87        786.17
10000     1521.87       1463.46        917.38        662.33
21544      441.18        509.31        596.76        460.04
46415      261.34        268.60        273.99        253.92
100000     267.93        281.90        277.57        263.16

where the ultimate goal is for the CppTran and PoomaII numbers
to be equal to or better than the C and C restrict numbers
(higher is better; these are roughly MFLOPs).  Here we assume
that the compilers are already very good at optimizing the C
code, which is usually true.

Note that the comparison is not fair in all tests, as sometimes
the Pooma tests split loops (which you could of course fuse
in theory).
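
As an illustration of what split versus fused loops means, here is a
small sketch. The function names are made up, and the kernel
(out = a*b + c) is just an example, not one of the shipped benchmarks.
Both versions assume the three inputs have the same length.

```cpp
#include <cstddef>
#include <vector>

// Split form: two passes over memory with a temporary, similar to what
// evaluating one array statement after another can produce.
std::vector<double> splitLoops(const std::vector<double>& a,
                               const std::vector<double>& b,
                               const std::vector<double>& c)
{
    const std::size_t n = a.size();
    std::vector<double> tmp(n), out(n);
    for (std::size_t i = 0; i < n; ++i)
        tmp[i] = a[i] * b[i];
    for (std::size_t i = 0; i < n; ++i)
        out[i] = tmp[i] + c[i];
    return out;
}

// Fused form: one pass, no temporary, as a hand-written C loop would do it.
std::vector<double> fusedLoop(const std::vector<double>& a,
                              const std::vector<double>& b,
                              const std::vector<double>& c)
{
    const std::size_t n = a.size();
    std::vector<double> out(n);
    for (std::size_t i = 0; i < n; ++i)
        out[i] = a[i] * b[i] + c[i];
    return out;
}
```

The split form reads and writes the temporary once more per element,
which costs memory bandwidth once the arrays no longer fit in cache.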

These tests are also good for testing OpenMP performance, but not
MPI performance (though that does not depend much on the compiler
anyway).

Be sure to play with the --sim-params parameter to get a
reasonable range of problem sizes that covers L1/L2 and main memory.
Most interesting are the numbers for problem sizes that still
fit in L2 cache, as this is where you can see the abstraction
penalty most clearly.
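
The N columns in the tables in this thread look logarithmically spaced.
The sketch below reproduces them under the assumption that the three
values mean (start size, number of decades, points per decade) and that
sizes are truncated to integers. That reading is inferred from the
tables only, not from the POOMA documentation, and problemSizes is a
hypothetical helper.

```cpp
#include <cmath>
#include <vector>

// Generate start * 10^(k / pointsPerDecade) for k = 0 .. decades*pointsPerDecade,
// truncated to an integer. The parameter meaning is an assumption inferred
// from the benchmark tables, not a documented POOMA interface.
std::vector<long> problemSizes(long start, int decades, int pointsPerDecade)
{
    std::vector<long> sizes;
    const int steps = decades * pointsPerDecade;
    for (int k = 0; k <= steps; ++k) {
        const double exact = static_cast<double>(start) *
            std::pow(10.0, static_cast<double>(k) / pointsPerDecade);
        // The small epsilon guards against values like 999.9999999 at exact
        // powers of ten.
        sizes.push_back(static_cast<long>(std::floor(exact + 1e-6)));
    }
    return sizes;
}
```

Under this reading, problemSizes(10, 2, 3) gives 10, 21, 46, 100, 215,
464, 1000 and problemSizes(10, 1, 2) gives 10, 31, 100.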

From a good compiler I expect the C and CppTran performance
numbers to match, and the PoomaII numbers to match at least for
the main-memory-sized problems.

If you're working against CVS HEAD of FreePOOMA, there is some
library-level optimization for unit-stride access; this may
help optimize the innermost loops.
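
To show the general idea of a unit-stride fast path, here is a generic
sketch. It is not FreePOOMA's actual implementation; the function names
and the axpy-style kernel are chosen for illustration only.

```cpp
#include <cstddef>

// General case: elements are `stride` apart, so the compiler has to
// compute an index multiplication and may be unable to vectorize.
void axpyStrided(double* y, const double* x, double alpha,
                 std::size_t n, std::size_t stride)
{
    for (std::size_t i = 0; i < n; ++i)
        y[i * stride] += alpha * x[i * stride];
}

// Unit-stride case: contiguous access, the form compilers optimize best.
void axpyUnit(double* y, const double* x, double alpha, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        y[i] += alpha * x[i];
}

// Dispatch once outside the inner loop, so the hot loop is specialized.
void axpy(double* y, const double* x, double alpha,
          std::size_t n, std::size_t stride)
{
    if (stride == 1)
        axpyUnit(y, x, alpha, n);
    else
        axpyStrided(y, x, alpha, n, stride);
}
```

The check happens once per call rather than once per element, so the
inner loop for the common contiguous case carries no stride arithmetic.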

The numbers above for BlitzLoops are for gcc-3.4, like the
following for ABCTest:

ABCTest> ./LINUXgcc/ABC --no-diags --run-impls 0 1 2 3 --sim-params 10 2 3
             C                        CppTran       PoomaII
N          restrict         C             Bk            Bk
10         953.09        953.10        513.33        105.56
21        1129.77       1167.20        563.80        190.12
46        1076.90       1089.12        564.65        225.16
100       1098.92       1064.97        536.62        245.19
215        278.22        268.69        261.83        179.41
464        269.99        274.75        271.01        187.12
1000       264.26        270.93        255.07        163.58

and Doof3d:

Doof3d> ./LINUXgcc/Doof3d --no-diags --sim-params 10 1 2
             C                                      PoomaII       PoomaII
N          restrict         C          CppTran        NoOpt          Opt
10         624.94        628.32         82.83        113.32         82.34
31         638.89        645.34         82.88        154.41         83.11
100        644.08        635.40         82.44        155.17         81.93


Richard.

--
Richard Guenther <richard dot guenther at uni-tuebingen dot de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/


