Re: [Bug-apl] Experimental OpenMP patch


From: David Lamkins
Subject: Re: [Bug-apl] Experimental OpenMP patch
Date: Tue, 11 Mar 2014 08:27:13 -0700

Thanks, Jürgen.

I'll try to work up some test cases this week.

In my quick scan of the OpenMP document yesterday, I noted that there
are different strategies for assigning work to threads. As with just
about everything else in OpenMP, the strategy is configurable.
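
As a rough illustration of two such strategies (the interleaved and the blocked assignment described in point 4 of the quoted message below), here is a minimal sketch in plain C++ that only computes which thread would own which iteration. The function names are made up for this example and nothing here comes from the patch. In OpenMP terms, schedule(static, 1) hands out iterations round-robin, while schedule(static) without a chunk size gives each thread one contiguous block of roughly equal size; the exact block boundaries a given OpenMP runtime picks may differ from the formula used here.

```cpp
#include <cassert>
#include <cstdio>

// Interleaved assignment: thread j owns iterations j, j+cores, j+2*cores, ...
// (hypothetical helper, for illustration only)
int owner_interleaved(int i, int cores)
{
    assert(cores > 0 && i >= 0);
    return i % cores;
}

// Blocked assignment: thread j owns iterations
// j*max/cores ... (j+1)*max/cores - 1 (integer division).
// (hypothetical helper, for illustration only)
int owner_blocked(int i, int max, int cores)
{
    assert(cores > 0 && i >= 0 && i < max);
    for (int j = 0; j < cores; ++j)
    {
        // upper bound (exclusive) of the block owned by thread j
        const long end = static_cast<long>(j + 1) * max / cores;
        if (i < end) return j;
    }
    return cores - 1;   // not reached for valid input
}

// Prints the owning thread of every iteration under both strategies.
void print_assignment(int max, int cores)
{
    for (int i = 0; i < max; ++i)
        std::printf("iteration %d: interleaved -> thread %d, blocked -> thread %d\n",
                    i, owner_interleaved(i, cores), owner_blocked(i, max, cores));
}
```

Calling print_assignment(8, 4) lists the owner of each of 8 iterations on 4 threads. With the blocked variant, neighbouring elements stay on the same thread, which is the cache-friendliness argument made in the quoted message.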

My initial thought for putting OpenMP configuration in quad-syl was
that doing so would facilitate benchmarking. Longer term, runtime
access might be useful to experimentally fine-tune APL to platform and
problem.

Given the dynamic nature of OpenMP's configuration, I'm not convinced
that compile-time configuration would yield any benefit. Then again,
I've only skimmed the OpenMP docs...

In any case, the quad-syl changes are nicely localized and easy enough
to back out, should it prove necessary or beneficial in the long run.

I see your point regarding processor affinity. I note that the OpenMP
document actually recommends this for the benefits that might derive
from the per-core data and instruction caches.
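
On the exception question raised in the quoted messages below (a throw inside a worker thread has to be caught by that same thread), one common pattern is to catch inside the loop body, remember the exception, and rethrow it once the parallel loop has finished. The following is only a sketch under that assumption: the function names are hypothetical and this is not how the patch or GNU APL currently handles errors. Built without OpenMP support, the pragmas are ignored and the loop simply runs serially.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <vector>

// Hypothetical per-element operation that may throw; it stands in for
// the computation of a single cell of a scalar function.
int scaled_reciprocal(int x)
{
    if (x == 0) throw std::domain_error("DOMAIN ERROR");
    return 1000 / x;
}

// Applies scaled_reciprocal() to every element. Exceptions are caught
// inside the loop body, i.e. on the thread that threw them. One of them
// (whichever is recorded first) is rethrown after the loop has ended.
std::vector<int> apply_all(const std::vector<int>& in)
{
    const long n = static_cast<long>(in.size());
    std::vector<int> out(in.size());
    std::exception_ptr captured_error;   // shared by all threads

    #pragma omp parallel for
    for (long i = 0; i < n; ++i)
    {
        try
        {
            out[i] = scaled_reciprocal(in[i]);
        }
        catch (...)
        {
            // serialize access to the shared exception_ptr
            #pragma omp critical
            {
                if (!captured_error)
                    captured_error = std::current_exception();
            }
        }
    }

    // Back on the calling thread: report the error, if there was one.
    if (captured_error) std::rethrow_exception(captured_error);
    return out;
}
```

A caller would wrap apply_all() in an ordinary try/catch at the top level. The cost is that the remaining iterations still run after an error has been recorded.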

On Tue, Mar 11, 2014 at 8:07 AM, Juergen Sauermann
<address@hidden> wrote:
> Hi David,
>
> looks good! Some comments, though.
>
> 1. You could adapt src/testcases/Performance.pt with some longer
> scalar functions in order to get some performance figures. You can start it
> like this:
>
> ./apl -T testcases/Performance.pt
>
> 2. I believe we should not bother the user with specifying parallelization
> parameters in ⎕SYL. I would rather use ./configure CORES=n, with n=1 meaning
> no parallel execution, CORES=auto meaning the number of cores on the build
> machine, and explicit numbers n>1 meaning that n cores shall be used. This
> would generate slightly faster code than computing array bounds at runtime.
> It's a bit more hassle for the user, but may pay off soon.
>
> 3. Yes, GNU APL throws many exceptions (almost every APL error is thrown
> from somewhere), and I was expecting that we would have to catch them on the
> throwing processor. Not too difficult if we do it on the top level.
>
> 4. It would be good to understand how the OpenMP loops work. I could
> imagine one of two strategies:
>
> - in loop(j, MAX)   thread j executes iteration j, j+CORES, ...
> - thread j executes iterations j*MAX/CORES ... (j+1)*MAX/CORES
>
> The first strategy interleaves the data and is more intuitive,
> while the second uses blocks of data, is more cache-friendly, and
> therefore probably gives better performance.
>
> 5. Not sure if your earlier comment on letting the scheduler decide is
> correct. I have been doing pthread programming in the past and I have seen
> cases where the scheduler fooled itself and led to cases where the same
> problem took more than double the capacity compared to explicit affinity on
> a 4-core CPU. I would expect that APL generates very fine-grained and
> short-lived pieces of execution and the scheduler may not be optimized for
> that. I guess we have to try that out.
>
> /// Jürgen
>
>
>
>
> On 03/11/2014 08:02 AM, David B. Lamkins wrote:
>>
>> Juergen's suggestion prompted me to attempt an implementation using
>> OpenMP rather than the by-hand coding that I had been anticipating.
>> Attached is a quick-and-dirty patch to enable GNU APL to be built with
>> OpenMP support.
>>
>> ./configure --with-openmp
>>
>> There are many rough edges, both in the Makefile and the code.
>>
>> --with-openmp would ideally check to see whether the compiler supports
>> OpenMP. It may be necessary to check the compiler version, as different
>> compilers support different versions of OpenMP. Also, I've assumed
>> compilation on/for Linux despite the fact that GNU APL and OpenMP should
>> be buildable with the right Windows compiler.
>>
>> As one might expect, OpenMP requires that any throw from a worker thread
>> must be caught by the same thread. I'm almost certain that this
>> restriction could be violated by GNU APL code as currently written.
>>
>> The good news, though, is that the changes are benign; in the absence of
>> --with-openmp, GNU APL's behavior is unchanged.
>>
>> With OpenMP support, ⎕syl is extended to access some of OpenMP's
>> parameters.
>>
>> I've done only trivial testing at this point; just enough to verify that
>> compiling OpenMP support doesn't obviously break GNU APL.
>>
>> I haven't confirmed that the OpenMP #pragmas on the key loops in
>> SkalarFunction.cc have any effect on execution time or processor core
>> utilization. I hope to do more testing later this week.
>>
>> Best wishes,
>>    David
>>
>



-- 
"The secret to creativity is knowing how to hide your sources."
   Albert Einstein


http://soundcloud.com/davidlamkins
http://reverbnation.com/lamkins
http://reverbnation.com/lcw
http://lamkins-guitar.com/
http://lamkins.net/
http://successful-lisp.com/


