Re: Parallel APL Questions
From: enztec
Subject: Re: Parallel APL Questions
Date: Fri, 7 Feb 2020 13:18:33 -0700
what magic programming language is this?
On Fri, 7 Feb 2020 20:44:58 +0100
Dr. Jürgen Sauermann <address@hidden> wrote:
>
> <html>
> <head>
> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
> </head>
> <body text="#000000" bgcolor="#FFFFFF">
> <font face="Helvetica, Arial, sans-serif">Sorry, I have to correct
> myself. My CPUs were I5 and I7, not I7 and I9. Both with a CPU
> frequency of 3.2 GHz.</font><br>
> <br>
> <div class="moz-cite-prefix">On 2/7/20 8:25 PM, Dr. Jürgen Sauermann
> wrote:<br>
> </div>
> <blockquote type="cite"
> cite="mid:address@hidden">
> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
> <font face="Helvetica, Arial, sans-serif">Hi Andrew,<br>
> <br>
> let me try to answer some of your questions inline below...<br>
> </font><br>
> <div class="moz-cite-prefix">On 2/7/20 6:35 PM, Andrew wrote:<br>
> </div>
> <blockquote type="cite"
> cite="mid:address@hidden">
> <meta http-equiv="Content-Type" content="text/html;
> charset=UTF-8">
> Good evening
> <div class=""><br class="">
> </div>
> <div class="">This is my first post to this mailing list. It is
> mainly some questions, not a bug report, so I hope it is
> appropriate to post it here. Apologies if not. (And
> apologies also for a rather long and rambling e-mail!)</div>
> <div class=""><br class="">
> </div>
> </blockquote>
> No problem, you found the right list.<br>
> <blockquote type="cite"
> cite="mid:address@hidden">
> <div class="">I recently learned of Gnu APL and, having had some
> experience of APL on IBM mainframes in the 1980s, I was
> curious to know how it would work on a couple of my computers,
> and to use it to compare performance of two virtualised and
> emulated environments.</div>
> <div class=""><br class="">
> </div>
> <div class="">Firstly, I installed it on Ubuntu 18.04.3 running
> under VMWare Fusion on a 2.3GHz 8-core Intel i9. This is the
> latest SVN version, built using CORE_COUNT_WANTED=syl on
> ./configure (not make parallels, which gave me a problem with
> autoconf). I then used ⎕syl[26;2] to set the number of cores.</div>
> <div class=""><br class="">
> </div>
> <div class="">Using ⎕ai to obtain the compute time, I tried
> using 1 and 4 cores for brute force prime number counting,
> using this expression: r←⍴(1=+⌿0=r∘.∣r)/r←1↓⍳n</div>
> <div class=""><br class="">
> </div>
> </blockquote>
> ⎕AI is rather imprecise, even worse than ⎕TS. For performance
> measurements on Intel<br>
> CPUs you should use ⎕FIO ¯1 (return CPU cycle counter) and maybe
> ⎕FIO ¯2 (return CPU frequency).<br>
> ⎕FIO ¯1 is the most precise timing source that you can get in GNU
> APL.<br>
> <blockquote type="cite"
> cite="mid:address@hidden">
> <div class="">Although I could see, on the system monitor, that
> 4 cores were being used, the execution time with n=10000
> actually took longer for the 4 core case, typically 15-20%
> more time than the 1 core case.</div>
> <div class=""><br class="">
> </div>
> </blockquote>
> The expression that you benchmarked is a mix of parallelized
> and non-parallelized APL<br>
> primitives. The execution time of each of them varies from run to run, so
> it is difficult to tell whether the increased<br>
> execution time is caused by the parallel execution or by that
> run-to-run variation.<br>
> <blockquote type="cite"
> cite="mid:address@hidden">
> <div class="">However, I then tried it in a very different
> environment: Ubuntu 18.04.3 again, but running in an emulated
> IBM S/390 mainframe (using the Hercules S/370 emulator running
> in Ubuntu in VMWare on a 3.5 GHz 6-core Xeon). For n=5000,
> this gave the opposite result: the 4 core case was approx. 45%
> quicker.</div>
> <div class=""><br class="">
> </div>
> </blockquote>
> In my experience using all cores of a CPU is not optimal because
> external events from the OS (interrupts<br>
> etc.) slow down one of the cores used for APL, so that the core(s)
> hit by external events increase the<br>
> execution time of each primitive. If you leave one core unused
> (and if you are lucky), then the scheduler<br>
> of the OS will see which cores are busy (executing APL) and will
> direct those events to the unused core.<br>
> <br>
> I also rather doubt that a virtual or emulated environment is able
> to tell anything about parallelized APL.<br>
> By the way, there is a benchmarking workspace <b>Scalar3.apl</b>
> shipped with GNU APL that makes benchmarking of parallel GNU APL
> easier. Intel I9 is a good platform for running that workspace,
> but<br>
> avoid any virtualization and <b>./configure</b> it properly.<br>
> <br>
> <blockquote type="cite"
> cite="mid:address@hidden">
> <div class="">Directly comparing these two environments (one
> “simply” virtualized, the other emulated and virtualized) is
> not meaningful. It is to be expected that the emulated one
> will be very substantially slower. The more interesting point
> is, perhaps, that on the i9, using more cores actually slows
> it down whereas, in the emulated environment, which is
> effectively a *much* slower processor, using multiple cores
> does yield a modest speed-up.</div>
> <div class=""><br class="">
> </div>
> </blockquote>
> The speedups that can be achieved are generally disappointing. I
> have also compared Intel I7 with Intel I9.<br>
> It seems that at the same CPU frequency and with the same core count,
> the I9 is substantially faster<br>
> than the I7, but at the same time the I7 benefits more from
> parallelization than the I9. Most likely the<br>
> CPU optimizations in the I9 (compared to I7) aim at the same kind
> of parallelism, so that improvements<br>
> of one aspect (CPU architecture) are made at the expense of the
> other aspect (APL parallelization).<br>
> <br>
> <blockquote type="cite"
> cite="mid:address@hidden">
> <div class="">I am not sure which components of the expression
> (if any) would be parallelized by Gnu APL. So my questions
> are:</div>
> <div class=""><br class="">
> </div>
> <div class="">1. Is it plausible that, on a reasonably modern
> CPU (the i9), using multiple cores would slow down execution
> of this expression?</div>
> </blockquote>
> Could very well be. Only a rather small part of the expression is
> parallelized, since the majority of<br>
> its primitives are not parallelized.<br>
> <blockquote type="cite"
> cite="mid:address@hidden">
> <div class="">2. Which of the operators in the expression above
> would Gnu APL actually parallelize? <br>
> </div>
> </blockquote>
> Currently all scalar functions and inner and outer products of
> them. These are the ones for which one can prove<br>
> that, in theory and given the GNU APL implementation, they must have
> a linear speedup (linear in the<br>
> number of cores). That is, on an I9 a scalar function on 4 cores
> should be 4 times faster than on one<br>
> core. In real life it is only 1.5 or so times faster. This points
> to a hardware bottleneck between the cores<br>
> and the memory. The scalar functions are so lightweight that the
> memory accesses (fetching the operands<br>
> and storing the results) dominate the entire execution time.<br>
> <blockquote type="cite"
> cite="mid:address@hidden">
> <div class="">3. Are there any configuration changes that I
> could make to adjust the way in which parallelization is done?</div>
> <div class=""><br class="">
> </div>
> </blockquote>
> If by configurations you mean ./configure options, then no. However,
> some ./configure options have<br>
> performance impacts both for parallel and non-parallel execution.
> These should be switched off.<br>
> See README-2-configure for details.<br>
> <blockquote type="cite"
> cite="mid:address@hidden">
> <div class="">One other comment:</div>
> <div class=""><br class="">
> </div>
> <div class="">Before I realised that the svn version is more
> recent, I used the <b style="font-family: -webkit-standard;"
> class="">apl-1.8.tar.gz</b> version of the code that is
> available on the Gnu mirror. This seems to have a minor error
> in Parallel.hh: two occurrences of & in the definition of
> PRINT_LOCKED, which cause a compilation error. They appear to
> have been removed in the svn version.</div>
> <div class=""><br class="">
> </div>
> </blockquote>
> Yes. In the early days of GNU APL I updated the <b>apl-1.X.tar.gz</b>
> files after every bug fix. I was then told<br>
> by the GNU project that this would mess up their mirrors so I
> stopped doing that. Therefore problems in<br>
> 1.8 will only be fixed in 1.9, typically 1-2 years later.<br>
> <blockquote type="cite"
> cite="mid:address@hidden">
> <div class="">Any comments or answers would be appreciated.
> Thank you for taking the time to read my e-mail.</div>
> <div class=""><br class="">
> </div>
> </blockquote>
> You're welcome<br>
> Jürgen<br>
> <blockquote type="cite"
> cite="mid:address@hidden">
> <div class="">Andrew</div>
> <div class=""><br class="">
> </div>
> </blockquote>
> <br>
> </blockquote>
> <br>
> </body>
> </html>
>
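
For readers who do not know APL: the benchmarked expression r←⍴(1=+⌿0=r∘.∣r)/r←1↓⍳n builds the full divisibility table of the numbers 2..n against themselves and keeps those that have exactly one divisor in that range (themselves). Below is a rough Python sketch of the same brute-force count, together with a small timing helper. This is not GNU APL code. The names count_primes_brute_force and time_call are made up for illustration, and time.perf_counter_ns only stands in for a precise counter such as ⎕FIO ¯1. The helper repeats the measurement and keeps the fastest run, which is one common way to reduce the run-to-run variation mentioned in the quoted reply.

```python
import time


def count_primes_brute_force(n):
    """Count primes <= n the way the APL expression does (hypothetical helper)."""
    r = range(2, n + 1)  # r←1↓⍳n : the integers 2..n
    count = 0
    for candidate in r:
        # One column of 0=r∘.∣r, summed by +⌿: how many d in 2..n divide candidate
        divisors = sum(1 for d in r if candidate % d == 0)
        if divisors == 1:  # 1= : only the number itself divides it, so it is prime
            count += 1
    return count  # ⍴ of the compressed vector


def time_call(fn, *args, repeats=5):
    """Run fn several times; return its result and the fastest time in nanoseconds."""
    best = None
    result = None
    for _ in range(repeats):
        start = time.perf_counter_ns()
        result = fn(*args)
        elapsed = time.perf_counter_ns() - start
        best = elapsed if best is None else min(best, elapsed)
    return result, best


if __name__ == "__main__":
    primes, nanoseconds = time_call(count_primes_brute_force, 1000)
    print(primes, "primes, fastest run:", nanoseconds, "ns")
```

The work grows with n squared, just like the outer product in the APL version, so n should stay small when trying this in plain Python.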