
BLAS lib performance (was Re: MPI (was Re: BIG libraries ....))


From: Fredrik Lingvall
Subject: BLAS lib performance (was Re: MPI (was Re: BIG libraries ....))
Date: Wed, 05 Oct 2005 21:54:09 +0200
User-agent: Mozilla Thunderbird 1.0.6 (X11/20051004)



Hi, did you happen to build octave linking to it, and if so, what were the effects on performance? Cheers, M.



If you carry this out, please report back on your results (including
performance).



I did this some time ago:

http://www.octave.org/mailing-lists/help-octave/2004/168


I ran a quick performance test of ATLAS vs. GOTO BLAS on a Pentium M machine (Dell D810) and a dual Pentium 4 Prescott machine (Dell Precision 670), using the code below (Octave 2.1.71):

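T = [];   % elapsed times in seconds
N = [];   % matrix sizes

% for n = [500:500:5000 10000]   % variant that also times n = 10000
for n = 500:500:5000
  A = randn(n, n);

  % Time a single dense n-by-n matrix product.
  tic;
  B = A*A;
  t = toc;

  fprintf('Elapsed time (n=%d) %f\n', n, t);

  % Collect the results.
  T = [T t];
  N = [N n];
end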
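
To compare machines or matrix sizes, it can help to convert the elapsed times collected in T into a floating-point rate. A minimal sketch, assuming the usual count of about 2*n^3 floating-point operations for a dense n-by-n product; the names flops and gflops are only for illustration, and the N and T values are three of the GOTO timings on the dual Xeon listed in the results below:

```octave
% Matrix sizes and elapsed times (s), copied from the dual Xeon GOTO run.
N = [1000 2000 5000];
T = [0.252867 1.821469 27.726569];

% Assumption: a dense n-by-n product costs about 2*n^3 flops.
flops  = 2 * N.^3;
gflops = flops ./ T / 1e9;

for k = 1:numel(N)
  fprintf('n=%d: %.2f GFLOP/s\n', N(k), gflops(k));
end
```

With the loop above, N and T would already be filled in, so only the last part would be needed.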

Results:

**************** 2.0 GHz Pentium M 760 (2 MB cache), Dell Latitude D810 laptop, 2 GB RAM:

* ATLAS (libatlas_Linux_P4SSE2.so)

Elapsed time (n=500) 0.192438
Elapsed time (n=1000) 1.481611
Elapsed time (n=1500) 4.929739
Elapsed time (n=2000) 11.702435
Elapsed time (n=2500) 22.817465
Elapsed time (n=3000) 39.251930
Elapsed time (n=3500) 62.606858
Elapsed time (n=4000) 93.108703
Elapsed time (n=4500) 132.630548
Elapsed time (n=5000) 181.885659

* GOTO BLAS (libgoto_northwood32p-r1.00.so)

Elapsed time (n=500) 0.165478
Elapsed time (n=1000) 1.260498
Elapsed time (n=1500) 4.224197
Elapsed time (n=2000) 9.870070
Elapsed time (n=2500) 19.341520
Elapsed time (n=3000) 33.227217
Elapsed time (n=3500) 52.873640
Elapsed time (n=4000) 78.452588
Elapsed time (n=4500) 112.102844
Elapsed time (n=5000) 152.980586


**************** Dual 2.8 GHz Xeon (1 MB cache), Dell Precision 670, 3 GB RAM:

* Threaded ATLAS (libATLAS_Linux_P4ESSE3_2.so)

Elapsed time (n=500) 0.069242
Elapsed time (n=1000) 0.259086
Elapsed time (n=1500) 0.816377
Elapsed time (n=2000) 1.887494
Elapsed time (n=2500) 3.930341
Elapsed time (n=3000) 6.259690
Elapsed time (n=3500) 9.992612
Elapsed time (n=4000) 14.733824
Elapsed time (n=4500) 21.107499
Elapsed time (n=5000) 29.036249
Elapsed time (n=10000) 227.612336

* GOTO (libgoto_prescott32p-r1.00.so)

Elapsed time (n=500) 0.067996
Elapsed time (n=1000) 0.252867
Elapsed time (n=1500) 0.893660
Elapsed time (n=2000) 1.821469
Elapsed time (n=2500) 3.519627
Elapsed time (n=3000) 6.064184
Elapsed time (n=3500) 9.580817
Elapsed time (n=4000) 14.161038
Elapsed time (n=4500) 20.221677
Elapsed time (n=5000) 27.726569
Elapsed time (n=10000) 218.574903

* GOTO (libgoto_prescott32p-r1.00.so) with export OMP_NUM_THREADS=1

Elapsed time (n=500) 0.081871
Elapsed time (n=1000) 0.446672
Elapsed time (n=1500) 1.482445
Elapsed time (n=2000) 3.452727
Elapsed time (n=2500) 6.744513
Elapsed time (n=3000) 11.551446
Elapsed time (n=3500) 18.490894
Elapsed time (n=4000) 27.245389
Elapsed time (n=4500) 39.056861
Elapsed time (n=5000) 53.391296
Elapsed time (n=10000) 422.807981

On the laptop, GOTO BLAS was roughly 18% faster than ATLAS; on the dual Xeon,
GOTO BLAS was about 4% faster. Using two threads instead of one gave a speedup
of roughly 1.9 times on the dual Xeon machine (with GOTO BLAS).
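
These figures can be checked against the timings listed above. A small sketch, using only the n=5000 rows and taking the ratio of elapsed times as the measure of "faster" (an assumption about how the percentages were derived; the variable names are illustrative):

```octave
% Elapsed times (s) for n=5000, copied from the results above.
t_atlas_laptop = 181.885659;
t_goto_laptop  = 152.980586;
t_atlas_xeon   = 29.036249;   % threaded ATLAS
t_goto_xeon    = 27.726569;   % GOTO, default threading
t_goto_xeon_1  = 53.391296;   % GOTO, OMP_NUM_THREADS=1

% Ratio of elapsed times: how many times longer ATLAS took than GOTO.
r_laptop = t_atlas_laptop / t_goto_laptop;
r_xeon   = t_atlas_xeon / t_goto_xeon;

% Speedup of the default (threaded) GOTO run over the single-thread run.
s_threads = t_goto_xeon_1 / t_goto_xeon;

fprintf('laptop ATLAS/GOTO: %.3f\n', r_laptop);
fprintf('Xeon   ATLAS/GOTO: %.3f\n', r_xeon);
fprintf('thread speedup:    %.3f\n', s_threads);
```

The three ratios come out at about 1.19, 1.05 and 1.93 for this row; other matrix sizes give slightly different values.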

Fredrik

BTW, I noticed that MATLAB exploits the fact that B=A*A' (and A'*A) is symmetric: A*A' takes only half the time to compute compared to A*A. Is this functionality available in Octave as well?
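
One way to find out on a given installation is to time the two products side by side. A minimal sketch; whether a particular Octave build takes a faster symmetric path for A*A' depends on the version and the BLAS it is linked against, so the outcome is something to measure, not something assumed here. The size n=500 and the variable names are arbitrary choices for illustration:

```octave
n = 500;                 % small size so the check runs quickly
A = randn(n, n);

tic; B_full = A * A;   t_full = toc;   % general product
tic; B_sym  = A * A';  t_sym  = toc;   % product with a symmetric result

fprintf('A*A : %f s\n', t_full);
fprintf('A*A'': %f s\n', t_sym);

% Sanity check: A*A' equals its own transpose up to rounding.
rel_asym = norm(B_sym - B_sym', 'fro') / norm(B_sym, 'fro');
fprintf('relative asymmetry of A*A'': %g\n', rel_asym);
```

If t_sym is close to half of t_full for large n, the symmetric structure is probably being exploited; timings at n=500 are short, so a larger n gives a more reliable comparison.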




