Re: [Freeon-users] Ktrax-preprocessed.f with --enable-parallel-clones


From: Matt Challacombe
Subject: Re: [Freeon-users] Ktrax-preprocessed.f with --enable-parallel-clones
Date: Wed, 19 Oct 2011 22:02:43 -0600

KTrax was meant to be rewritten immediately; it was there only as a proof of
concept for the parallel ONX, and then Valery left.  The previous code (Eric's)
was 5-10x faster in serial, which proves the adage that there is nothing so
permanent as a temporary solution.  After SpAMM, exchange is my #1 priority....

-M

On Wed, Oct 19, 2011 at 9:57 PM, Nicolas Bock <address@hidden> wrote:
I have never seen the auto vectorizer explained like this, but from now on I will always picture lots of donuts when I use it :)

I suspect KTrax was generated by Mathematica, and I am sure it could be improved quite a bit. I am not all too familiar with Mathematica myself, but maybe it has some options to control how it generates code? Less unrolled, perhaps?

nick


On Wed, Oct 19, 2011 at 19:25, Jeff Hammond <address@hidden> wrote:
It is not surprising at all that those flags cause the compiler to
explode.  Compiling KTrax_7_6 with "-msse -msse2 -mfpmath=sse
-ffast-math -ftree-vectorize" is cruel and unusual punishment.  That
subroutine has a single loop that is more than 10000 lines long and is
completely vectorizable.

The compiler looks at it like a fat kid looks at a donut, but then the
donut turns out to be made of dark matter and weighs 10000 pounds.
One bite and this fat kid is rolling on the floor crying for his
mommy, which is to say, the compiler segfaults when it tries to
examine the possible vectorizations in this loop (or one like it).

It is prudent to de-unroll KTrax_7_6 into something more manageable, if
a compiler is to have any hope of vectorizing it.  Alternatively, one
could hand-vectorize with SSE and related nonportable intrinsics and
free the compiler of that task.

If you find a compiler that auto-vectorizes KTrax_7_6, you have found
a compiler that does not try very hard to vectorize :-)

Jeff

On Wed, Oct 19, 2011 at 8:06 PM, Nicolas Bock <address@hidden> wrote:
> Hi Jose,
> let me add to what Matt said, that we are seeing this compiler problem here
> as well. We usually work around it the same way you did, by turning off all
> compiler optimizations (-O0). I had put something along those lines into the
> makefiles, but it sounds like this workaround is broken right now. I
> will have a look at the makefiles and see why.
> Please let us know if you run into any other problems.
> nick
>
> On Wed, Oct 19, 2011 at 16:44, Matt Challacombe <address@hidden> wrote:
>>
>> Hi Jose,
>>
>> You've been busy ...  and done well.  Briefly, FreeON should be pretty stable
>> with MPI for parallel clones -- that is, MPI-enabled NEB band calculations.
>> Generalized MPI, that is, within a single configuration or clone, is broken
>> though, and will not be supported in this current instantiation.  We are in
>> the process of completely re-writing many of the algorithms in FreeON and
>> their parallel implementation rather than trying to maintain outdated
>> paradigms.
>>
>> However, we are using the current version to do biochemical calculations on
>> the 300-500 atom level (NEB), and it is quite stable at this level of
>> parallelism, giving what I consider a reasonable turnaround at the "good"
>> level of accuracy (about 7 digits in total energy).  If that seems like the
>> kind of capability you are looking for, we are more than happy to try to
>> support your efforts.  Maybe you've sent us the info in your e-mail, but in
>> future, could you let us know your gcc version, kernel version, etc.?
>>
>> All the best, Matt and Nick
>>
>> PS.  Also feel free to write back to me at this e-mail.
>>
>> On Wed, Oct 19, 2011 at 2:15 AM, Jose R. Valverde <address@hidden> wrote:
>>>
>>> Hi all,
>>>
>>>        I am trying to compile FreeON with (some) MPI support. I know.
>>>
>>>        Anyway, when I use --enable-parallel-clones it breaks compiling
>>> Ktrax-preprocessed.f, which, being just a huge list of trivial assignment
>>> statements, it shouldn't. The reason given is that the compiler runs out
>>> of virtual memory. This is likely unrelated to FreeON, as this machine
>>> has 72 GB of RAM and when the compiler breaks it has allocated only about
>>> 2.9 GB of it.
>>>
>>>        Now, this looks in all respects like a problem with the optimizer,
>>> and indeed, if I compile the file manually with the extra optimization
>>> options removed, it compiles OK and the build proceeds:
>>>
>>>         mpif90 -g -O2 -march=native  -I. -I../Modules -I. -I..
>>>         -I../OneE -I../Modules/MMA/LookUpTables_800_6x   -c -o KTrax.o
>>>         KTrax-preprocessed.f
>>>
>>> instead of
>>>
>>>        mpif90 -g -O2 -march=native -msse -msse2 -mfpmath=sse
>>>        -ffast-math -ftree-vectorize -pipe -ffixed-line-length-none -I.
>>>        -I../Modules -I. -I.. -I../OneE
>>>        -I../Modules/MMA/LookUpTables_800_6x   -c -o KTrax.o
>>>        KTrax-preprocessed.f
>>>
>>> I haven't tried many combinations to see which is the exact compiler
>>> flag causing this behavior, but I guess while this may be a bit slower,
>>> it will still work and be not too bad (-O2 remains).
>>>
>>>        Well, the compilation is going on now, and once it is finished
>>> I'll run the validation checks and let you know, but in principle it
>>> should pass.
>>>
>>>                                j
>>>
>>> --
>>>                        EMBnet/CNB
>>>                Scientific Computing Service
>>>        Solving all your computer needs for Scientific
>>>                        Research.
>>>
>>>                http://bioportal.cnb.csic.es
>>>                  http://www.es.embnet.org
>>>
>>
>
>



--
Jeff Hammond
Argonne Leadership Computing Facility
University of Chicago Computation Institute
address@hidden / (630) 252-5381
http://www.linkedin.com/in/jeffhammond
https://wiki.alcf.anl.gov/index.php/User:Jhammond
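
Jose mentions above that he had not tried flag combinations to find the exact
option that triggers the failure. One way to narrow it down might be a
leave-one-out probe: compile with the full failing flag set minus one flag at a
time and see which omissions let the build succeed. The sketch below is only an
illustration. The function name find_bad_flags, the variable names, and the
scratch object path are made up here; the compiler, source file, flags, and
include paths are copied from the two commands quoted in the thread. It assumes
it is run from the directory where those commands were run.

```shell
#!/bin/sh
# Leave-one-out probe: rebuild the file with the full flag set minus one
# flag at a time, and report which omissions let the compile succeed.
# All variable names and the function name are illustrative.

FC=${FC:-mpif90}                      # compiler wrapper used in the thread
SRC=${SRC:-KTrax-preprocessed.f}      # file that fails to compile
OBJ=${OBJ:-/tmp/ktrax-probe.$$.o}     # scratch object file (assumed path)

# Flags common to both commands quoted in the thread.
BASE="-g -O2 -march=native"
INCLUDES="-I. -I../Modules -I.. -I../OneE -I../Modules/MMA/LookUpTables_800_6x"
# Flags that appear only in the failing command.
EXTRA="-msse -msse2 -mfpmath=sse -ffast-math -ftree-vectorize -pipe -ffixed-line-length-none"

find_bad_flags() {
    for drop in $EXTRA; do
        # Rebuild the flag list without the flag under test.
        kept=""
        for flag in $EXTRA; do
            [ "$flag" = "$drop" ] || kept="$kept $flag"
        done
        # Compiler output is discarded; only the exit status is used.
        if $FC $BASE $kept $INCLUDES -c -o "$OBJ" "$SRC" >/dev/null 2>&1; then
            echo "compiles without $drop"
        else
            echo "still fails without $drop"
        fi
        rm -f "$OBJ"
    done
}
```

Calling find_bad_flags after sourcing the snippet prints one line per flag. A
flag reported as "compiles without" is implicated, possibly only in combination
with others. Because the compiler's messages are thrown away, a "still fails"
line does not say why the compile failed, so it is worth rerunning the
interesting cases by hand to read the actual error.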