From: Elias Mårtenson
Subject: Re: [Bug-apl] first shot at parallel APL
Date: Fri, 24 Oct 2014 13:16:56 +0800
Hi Elias,
if you are using a recent SVN version, you need to set the thresholds (vector sizes) above which
parallel execution is performed:
(⍳4) ∘.time 10⋆⍳7
0 0 1 3 29 254 2593
0 0 1 2 25 252 2618
0 0 1 2 26 258 2682
0 0 1 2 26 263 2866
)COPY 5 FILE_IO
loading )DUMP file /usr/local/lib/apl/wslib5/FILE_IO.apl...
1 FIO∆set_dyadic_threshold '⋆' ⍝ returns the previous threshold for dyadic ⋆
8070450532247928832
(⍳4) ∘.time 10⋆⍳7
0 0 0 2 30 250 2590
0 0 0 1 15 149 1580
0 0 0 1 11 113 1225
0 3 0 0 12 103 1120
I am currently working on a benchmark workspace that determines the optimal thresholds
for the different scalar functions (and those thresholds will become the future defaults). Right
now the default thresholds are so high that you will always get sequential execution.
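A minimal sketch of how a benchmark run might lower the threshold and then put the old value back. It uses only the call shown in the session above and assumes that the left argument is the new threshold and that the result is the previous one, as the comment there indicates. OLD and PREV are illustrative names, and time is the function defined in the quoted message below.

```apl
⍝ Load FIO∆set_dyadic_threshold, as in the session above
)COPY 5 FILE_IO
⍝ Lower the threshold for dyadic ⋆ and keep the previous value (assumed to be the result)
OLD ← 1 FIO∆set_dyadic_threshold '⋆'
⍝ Run the benchmark with the lowered threshold
(⍳4) ∘.time 10⋆⍳7
⍝ Put the previous threshold back; PREV receives the threshold that was replaced
PREV ← OLD FIO∆set_dyadic_threshold '⋆'
```

Restoring the saved value keeps later measurements in the same session comparable with the default behaviour.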
/// Jürgen
On 09/26/2014 07:22 AM, Elias Mårtenson wrote:
I've tested this code, and I don't see much of an improvement as I increase the core count:
Given the following function:
∇Z ← NCPU time LEN;T;X;tmp
⎕SYL[26;2] ← NCPU
X ← LEN⍴2J2
T ← ⎕TS
tmp ← X⋆X
Z←1 1 1 24 60 60 1000⊥⎕TS - T
∇
I'm running this command on my 8-core workstation:
(⍳8) ∘.time 10⋆⍳7
0 0 0 2 19 188 2139
0 0 1 2 19 189 2147
0 0 1 2 19 210 2256
0 0 0 2 19 194 2427
0 0 0 3 28 284 3581
0 0 0 3 27 280 3510
0 0 0 3 27 284 3754
0 0 0 3 27 279 3637
Regards,
Elias
On 26 September 2014 13:05, Elias Mårtenson <address@hidden> wrote:
Thanks, I have merged the necessary changes.
Regards,
Elias
On 22 September 2014 23:50, Juergen Sauermann <address@hidden> wrote:
Hi,
I have finished a first shot at parallel (i.e. multicore) GNU APL: SVN 480.
This version computes all scalar functions in parallel if the ravel length of the result exceeds 100.
This can make the computation of small (but still > 100) vectors slower than if they were computed sequentially.
Therefore parallel execution is not yet the default. To enable it:
./configure
make parallel
make
sudo make install
The current version uses some Linux-specific features, which will be ported to other platforms later on (if possible).
./configure is supposed to detect whether these features are available.
Some simple benchmarks are promising:
X←1000000⍴2J2 ⍝ 1 million complex numbers
⎕SYL[26;2]←1 ⍝ 1 core
T←⎕TS ◊ ⊣X⋆X ◊ 1 1 1 24 60 60 1000⊥⎕TS - T
246
⎕SYL[26;2]←2 ⍝ 2 cores
T←⎕TS ◊ ⊣X⋆X ◊ 1 1 1 24 60 60 1000⊥⎕TS - T
136
⎕SYL[26;2]←3 ⍝ 3 cores
T←⎕TS ◊ ⊣X⋆X ◊ 1 1 1 24 60 60 1000⊥⎕TS - T
102
⎕SYL[26;2]←4 ⍝ 4 cores
T←⎕TS ◊ ⊣X⋆X ◊ 1 1 1 24 60 60 1000⊥⎕TS - T
91
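To put these timings in perspective, the sketch below derives the speedup and the parallel efficiency from the four measurements above. The variable names T, S and E are illustrative, and T[1] assumes the default ⎕IO←1.

```apl
⍝ Elapsed milliseconds measured above for 1, 2, 3 and 4 cores
T ← 246 136 102 91
⍝ Speedup relative to the single-core run (T[1] assumes ⎕IO←1)
S ← T[1] ÷ T
⍝ Parallel efficiency: speedup divided by the number of cores
E ← S ÷ ⍳4
⍝ One row per core count: cores, milliseconds, speedup, efficiency
⍉ 4 4 ⍴ (⍳4), T, S, E
```

The speedups come out at roughly 1.8, 2.4 and 2.7 for 2, 3 and 4 cores, so efficiency drops from about 0.9 to about 0.68 as cores are added.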
The next step will be to find the break-even points of all scalar functions, so that parallel execution is
only done when it promises some speedup.
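A rough sketch of how a break-even point could be read off a timing table. It reuses the 1-core and 4-core rows of the second table at the top of this thread (measured with the threshold for ⋆ set to 1); SEQ, PAR and LEN are illustrative names. This is not the benchmark workspace mentioned in the thread, only a comparison of two measured rows.

```apl
⍝ Milliseconds for X⋆X at lengths 10⋆⍳7, taken from the table in this thread
SEQ ← 0 0 0 2 30 250 2590    ⍝ 1 core
PAR ← 0 3 0 0 12 103 1120    ⍝ 4 cores
LEN ← 10⋆⍳7
⍝ Vector lengths at which the 4-core run was strictly faster
(PAR < SEQ) / LEN
```

For these figures the comparison first holds at length 10⋆4. Timings of a few milliseconds are dominated by clock resolution, so a real break-even search would need repeated runs and a finer grid of lengths.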
Elias, the PointerCell constructor has got one more argument. I have updated emacs-mode and sql accordingly;
you may want to sync back.
/// Jürgen