libtool-patches

Re: Fortran libraries on the Blue Gene with mpi


From: Ralf Wildenhues
Subject: Re: Fortran libraries on the Blue Gene with mpi
Date: Sat, 25 Apr 2009 21:15:01 +0200
User-agent: Mutt/1.5.18 (2008-05-17)

Hello Christian, John,

>Ralf Wildenhues wrote:
>> Create six build trees and build and run the Libtool test suites
>> with each of the compiler combinations (the following assumes
>> Bourne-shell syntax):
>>
>>   mkdir build-gcc build-xl build-bgcc build-bgxl build-mpigcc build-mpixl

> please find attached the logs (logs.tar.gz) except for bgcc.

Thank you both for all your efforts.  The message from Christian which I
am quoting, and one from John, didn't make it to the list due to the
size of the logs; I am going to summarize them inline with this reply,
going along with Christian's log files and noting where John's differ
(when applicable).


>>   # GCC, non-BG
>>   cd build-gcc
>>   ../configure CC=gcc CXX=g++ F77=g77 FC=gfortran GCJ=no
>>   make
>>   make -k check VERBOSE=yes 2>&1 | tee checklog-gcc-1
>>   cd ..

With these, all F77 and FC tests in both testsuites failed, as neither
g77 nor gfortran seems to be installed on this system.  All other tests
pass, which is pretty good already.  Nothing left to do here (unless you
want to install these compilers).
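
To get a quick overview of what failed in one of the saved check logs,
something like the following might do.  This is only a sketch: the helper
name list_failures is made up, and it assumes that the old testsuite
prints "FAIL: <test>" lines while the new (Autotest) testsuite prints
lines containing "FAILED (<file>.at:<line>)".

```shell
# Hypothetical helper: list the failed tests recorded in a saved check log.
# Assumption: old-testsuite failures look like "FAIL: tests/foo.test" and
# new-testsuite failures contain "FAILED (file.at:NNN)".
list_failures() {
  grep -E '^FAIL: |FAILED \(' "$1"
}

# Example (not run here): list_failures build-gcc/checklog-gcc-1
```

Note that grep exits with status 1 when the log contains no failures.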


>>   # XL, non-BG
>>   cd build-xl
>>   ../configure CC=xlc CXX=xlC F77=xlf FC=xlf95 GCJ=no
>>   make
>>   make -k check VERBOSE=yes 2>&1 | tee checklog-xl-1
>>   cd ..

All tests passed, both testsuites.  Yay!

However, as a minor note, the logs all show:

| checking dependency style of xlc... none
[...]
| checking dependency style of xlC... none

which is kind of weird.  IIRC the XL compilers have working dependency
extraction mechanisms, which are detected by the Automake code.  There
has been one bug fix in Automake's depcomp script, but it was limited to
the --disable-static case.  It would be worthwhile to investigate this.
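
As a starting point for that investigation, the detected dependency
styles can be pulled out of config.log.  The following is a minimal
sketch; the name list_depmodes is made up, and it assumes that config.log
records each check as a "checking dependency style of <compiler>" line
followed later by a "result: <style>" line.

```shell
# Hypothetical helper: print "<compiler>: <depmode>" for every dependency
# style check found in a config.log file.
list_depmodes() {
  awk '/: checking dependency style of / { cc = $NF; next }
       cc != "" && /: result: /          { print cc ": " $NF; cc = "" }' "$1"
}

# Example (not run here): list_depmodes build-xl/config.log
```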

xlc and xlC understand -qpic and -qstaticlink, good.
xlf understands only -qpic, not -qstaticlink; oh well.


>>   # GCC, BG
>>   cd build-bgcc
>>   ../configure CC=bgcc CXX=bgc++ F77=bgf77 FC=bgfortran GCJ=no \
>>                LDFLAGS=-dynamic
>>   make
>>   make -k check VERBOSE=yes 2>&1 | tee checklog-bgcc-1
>>   cd ..
>
>configure failed:
>## ---------------------------------------------------- ##
>## Configuring libtool (Build:1.3089 2009-03-29) 2.2.7a ##
>## ---------------------------------------------------- ##
>
>checking for a BSD-compatible install... /usr/bin/install -c
>checking whether build environment is sane... yes
>checking for a thread-safe mkdir -p... /bin/mkdir -p
>checking for gawk... gawk
>checking whether make sets $(MAKE)... yes
>checking whether subdir libobjs are useable... yes
>checking for gcc... bgcc
>checking for C compiler default output file name...
>configure: error: in
>`/u/fzj301zm/BlueGene/fortran_libraries_on_the_blue_gene_with_mpi/libtool/build-bgcc':
>configure: error: C compiler cannot create executables
>See `config.log' for more details.
>
>
>bgcc and bgcc_r are wrappers for pre-ANSI C. Don't know if it's worth
>the effort to support this.

Ah, I didn't know this.  You could look into the build-bgcc/config.log
file to find out the specific reason why this failed.  But if these
compilers aren't important anyway, then I won't mind if we ignore them.
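
If you do want to look, the relevant part of config.log can be isolated
roughly like this.  It is only a sketch with a made-up helper name
(show_check); it prints everything from the matching "checking ..." line
up to the next one.  For a check that aborts configure, that means
everything up to the end of the log, so piping through head may help.

```shell
# Hypothetical helper: show the config.log section belonging to one check.
# $1 = path to config.log, $2 = fixed string identifying the check.
show_check() {
  awk -v pat="$2" '
    /: checking / { show = (index($0, pat) > 0) }
    show          { print }
  ' "$1"
}

# Example (not run here):
#   show_check build-bgcc/config.log 'C compiler default output' | head -n 40
```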


>>   # XL, BG
>>   cd build-bgxl
>>   ../configure CC=bgxlc CXX=bgxlC F77=bgfort FC=bgxlf95 GCJ=no \
>>                LDFLAGS=-qnostaticlink
>>   make
>>   make -k check VERBOSE=yes 2>&1 | tee checklog-bgxl-1
>>   cd ..

This is where things start to get interesting.

bgxlc and bgxlC understand -qpic and -qstaticlink, good.
bgxlf95 understands -qpic but not -qstaticlink; it however accepts
the -qnostaticlink flag.

Test failures:

- f77demo-* in the old testsuite
  This is because the bgfort command does not exist.
  It was a typo, should have been F77=bgfort77 or F77=bgf77 or F77=bgxlf
  I guess.  If you have energy left, here's how you can rerun those
  tests:

   cd build-bgxl
   ../configure CC=bgxlc CXX=bgxlC F77=bgfort77 FC=bgxlf95 GCJ=no \
                LDFLAGS=-qnostaticlink
   gmake
   gmake -k check VERBOSE=yes TESTSUITEFLAGS='-k F77' TESTS="\
        tests/f77demo-static.test \
        tests/f77demo-make.test \
        tests/f77demo-exec.test \
        tests/f77demo-conf.test \
        tests/f77demo-make.test \
        tests/f77demo-exec.test \
        tests/f77demo-shared.test \
        tests/f77demo-make.test \
        tests/f77demo-exec.test"


- fcdemo-exec fails after fcdemo-static:

| bgxlf95  -g -c -o fprogram.o  /u/fzj301zm/BlueGene/fortran_libraries_on_the_blue_gene_with_mpi/libtool/tests/fcdemo/fprogram.f90
| ** fprogram   === End of Compilation 1 ===
| 1501-510  Compilation successful for file fprogram.f90.
| /bin/sh ./libtool   --mode=link bgxlf95  -g  -qnostaticlink -o fprogram fprogram.o libfoo.la libfoo3.la -ldl
| libtool: link: bgxlf95 -g -qnostaticlink -o fprogram fprogram.o  ./.libs/libfoo.a /u/fzj301zm/BlueGene/fortran_libraries_on_the_blue_gene_with_mpi/libtool/build-bgxl/tests/fcdemo/.libs/libfoo2.a ./.libs/libfoo3.a -ldl
[...]
| PASS: tests/fcdemo-make.test
| fcdemo-exec.test: ===  Running fcdemo-exec.test
| fcdemo-exec.test: ===  Executing uninstalled programs in build-bgxl
| tests/defs: line 1132:  9158 Illegal instruction     tests/fcdemo/fprogram
| fcdemo-exec.test: ../tests/fcdemo-exec.test: cannot execute tests/fcdemo/fprogram
| fcdemo-exec.test: ===  You may need to run ../tests/fcdemo-exec.test as the superuser.
|  fsub called
|  fsubf called
| Welcome to GNU libtool mixed C/Fortran demo!
| The C subroutine returned, claiming that 2*2 = 4
| The C subroutine is ok!
| 
| Calling the C wrapper routine...
| Calling the Fortran subroutine from the C wrapper...
| Returned from the Fortran subroutine...
| The C wrapper to the fortran subroutine returned,
| claiming that 2*2 = 4
| The Fortran subroutine is ok!
| FAIL: tests/fcdemo-exec.test

This can indicate a bug in the compiler or linker.  Or maybe just that
-qnostaticlink should not be used in conjunction with --disable-shared.
Or a bug in the _MAIN detection Autoconf macros.  Dunno.
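
One way to narrow this down would be to check whether the failing
executable ended up dynamically linked at all.  A minimal sketch, assuming
a readelf that understands the target's ELF files is available on the
front end (the helper name is_dynamic is made up): a dynamically linked
ELF executable carries an INTERP program header, a static one does not.

```shell
# Hypothetical helper: succeed if the file is an ELF executable that
# requests a dynamic loader (i.e. has an INTERP program header).
# Returns non-zero for static executables, non-ELF files, or if
# readelf is not available.
is_dynamic() {
  readelf -l "$1" 2>/dev/null | grep -q 'INTERP'
}

# Example (not run here):
#   if is_dynamic tests/fcdemo/fprogram; then echo dynamic; else echo static; fi
```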


- fcdemo-make fails after fcdemo-conf and after fcdemo-shared:

| /bin/sh ./libtool   --mode=link bgxlf95  -g  -qnostaticlink -o fprogram fprogram.o libfoo.la libfoo3.la -ldl
| libtool: link: LD_RUN_PATH="/u/fzj301zm/BlueGene/fortran_libraries_on_the_blue_gene_with_mpi/libtool/build-bgxl/_inst/lib:" bgxlf95 -g -qnostaticlink -o .libs/fprogram fprogram.o  ./.libs/libfoo.so /u/fzj301zm/BlueGene/fortran_libraries_on_the_blue_gene_with_mpi/libtool/build-bgxl/tests/fcdemo/.libs/libfoo2.so ./.libs/libfoo3.so -ldl
| /bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/ld: attempted static link of dynamic object `./.libs/libfoo.so'
| gmake[5]: *** [fprogram] Error 1

(the cprogram link succeeds)

I can't make heads or tails of this yet.  Maybe bgxlf95 doesn't
understand -qnostaticlink after all and needs some other flag, or cannot
link against shared libraries at all?


- In the new testsuite, the F77 tests of course fail due to nonexistent
  bgfort, too.

- and FC tests fail, for similar reasons as above:

| ../../tests/convenience.at:219: $LIBTOOL --tag=FC --mode=link $FC $FCFLAGS $LDFLAGS -static -o main_static$EXEEXT main$i.lo liba$conv.la
| stderr:
| stdout:
| libtool: link: bgxlf95 -g -qnostaticlink -o main_static .libs/main2.o  ./.libs/liba12.a
| ../../tests/convenience.at:221: $LIBTOOL --tag=FC --mode=link $FC $FCFLAGS $LDFLAGS -o main$EXEEXT main$i.lo liba$conv.la
| stderr:
| /bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/ld: attempted static link of dynamic object `./.libs/liba12.so'
| stdout:
| libtool: link: LD_RUN_PATH="/notexist:" bgxlf95 -g -qnostaticlink -o .libs/main .libs/main2.o  ./.libs/liba12.so
| ../../tests/convenience.at:221: exit code was 1, expected 0
| 24. convenience.at:169: 24. FC convenience archives (convenience.at:169): FAILED (convenience.at:221)

- Also, the sys_lib_search_path test fails, for the simple reason that
  there is no libz installed for this setup:

| 34. search-path.at:25: testing ...
| libtool: link: bgxlc -g -qnostaticlink -o main main.o  -L/lib -lz
| /bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/ld: cannot find -lz
| libtool: link: bgxlc -g -qnostaticlink -o main main.o  -L/usr/lib -lz
| ../../tests/search-path.at:48: $LIBTOOL --mode=link $CC $CFLAGS $LDFLAGS -o main$EXEEXT main.$OBJEXT -lz
| stderr:
| /bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/ld: cannot find -lz
| stdout:
| libtool: link: bgxlc -g -qnostaticlink -o main main.o  -lz
| ../../tests/search-path.at:48: exit code was 1, expected 0
| 34. search-path.at:25: 34. sys_lib_search_path (search-path.at:25): FAILED (search-path.at:48)

This testsuite bug was fixed only after the 2.2.6a release.


Also, bgxlc and bgxlC both get depmode none again, similar to above.

Also, the config.log file shows this interesting bit:

| configure:11333: checking whether a program can dlopen itself
| configure:11403: bgxlc -o conftest -g  -DHAVE_DLFCN_H -qnostaticlink -Wl,--export-dynamic conftest.c -ldl  >&5
| configure:11406: $? = 0
| configure:11424: result: yes
| configure:11429: checking whether a statically linked program can dlopen itself
| configure:11499: bgxlc -o conftest -g  -DHAVE_DLFCN_H -qnostaticlink -Wl,--export-dynamic -qstaticlink conftest.c -ldl  >&5
| /bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/ld: Dwarf Error: mangled line number section (bad file number).
| /bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/ld: Dwarf Error: mangled line number section (bad file number).
| /bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/ld: Dwarf Error: mangled line number section (bad file number).
| /bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/ld: Dwarf Error: mangled line number section (bad file number).
| /bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/ld: Dwarf Error: mangled line number section (bad file number).
| /bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/ld: Dwarf Error: mangled line number section (bad file number).
| /bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/ld: Dwarf Error: mangled line number section (bad file number).
| conftest.o: In function `main':
| /u/fzj301zm/BlueGene/fortran_libraries_on_the_blue_gene_with_mpi/libtool/build-bgxl/configure:11483: warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
The Dwarf error messages look like work to do on the binutils side.  :-)

Right after that:

| configure:11502: $? = 0
| /lib/: cannot read file data: Is a directory
| configure:11520: result: no
| configure:11559: checking whether stripping libraries is possible
| configure:11564: result: yes

The /lib/ looks pretty weird.  I don't yet understand where it comes
from, but it could be a bug in _LT_TRY_DLOPEN_SELF or LT_SYS_DLOPEN_SELF.
We should try to analyse and fix it.

Can you do something like this and post the configure standard output
and standard error?

   cd build-bgxl
   sed '/checking whether a statically linked program can/a\
        set -x
        /result.*lt_cv_dlopen_self_static/a\
        set +x' < ../configure > ../configure-debug
   chmod +x ../configure-debug
   ../configure-debug CC=bgxlc CXX=bgxlC F77=bgfort77 FC=bgxlf95 GCJ=no \
                LDFLAGS=-qnostaticlink
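
In case it is unclear what the sed script is meant to do: it inserts
"set -x" after the line announcing the static dlopen check and "set +x"
after the line reporting its result, so the shell traces only that one
check.  Here is the same idea as a small sketch that can be tried on any
script; the function name instrument_configure is made up, and the copy
is marked executable because a file created by redirection usually is
not.

```shell
# Hypothetical helper: copy a configure script, enabling shell tracing
# (set -x) only around the static dlopen-self check.
# $1 = original script, $2 = instrumented copy to create.
instrument_configure() {
  sed '/checking whether a statically linked program can/a\
set -x
/result.*lt_cv_dlopen_self_static/a\
set +x' < "$1" > "$2" &&
  chmod +x "$2"
}

# Example (not run here): instrument_configure ../configure ../configure-debug
```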



>>   # GCC, MPI
>>   cd build-mpigcc
>>   ../configure CC=mpicc CXX=mpicxx F77=mpif77 FC=mpif90 GCJ=no \
>>                LDFLAGS=-dynamic
>>   make
>>   make -k check VERBOSE=yes 2>&1 | tee checklog-mpigcc-1
>>   cd ..

More failures here:

- tagdemo-exec.test after tagdemo-{static,conf,shared,undef}.test:

| tagdemo-exec.test: ===  Running tagdemo-exec.test
| tagdemo-exec.test: ===  Executing uninstalled programs in build-mpigcc
| tests/defs: line 1132:  1355 Illegal instruction     tests/tagdemo/tagdemo
| tagdemo-exec.test: ../tests/tagdemo-exec.test: cannot execute tests/tagdemo/tagdemo

Ouch.  Maybe we should try again without LDFLAGS=-dynamic.
Or we need to pass -dynamic through to the shared library creation.
Anyway, this is a bit disappointing, as it means we don't have a way to
produce working C++ executables and libraries yet.

Are C++ programs and libraries working ok with these compilers in
general?


- f77demo-*.test: Fortran compiler mpif77 doesn't work, due to:

| /bgsys/drivers/V1R3M0_460_2008-081112P/ppc/gnu-linux/libexec/gcc/powerpc-bgp-linux/4.1.2/f951: error while loading shared libraries: libmpfr.so.1: cannot open shared object file: No such file or directory

- fcdemo-*.test: Fortran compiler name mpif90 doesn't work, due to:

| /bgsys/drivers/V1R3M0_460_2008-081112P/ppc/gnu-linux/libexec/gcc/powerpc-bgp-linux/4.1.2/f951: error while loading shared libraries: libmpfr.so.1: cannot open shared object file: No such file or directory

- In the new testsuite, all C++, Fortran 77/90 tests failed too,
  consequently.

Can you do the following to rerun those tests?
Find the directory where that libmpfr.so.1 is installed.  Say, it is
in $foodir.  Then

   LD_LIBRARY_PATH=$foodir${LD_LIBRARY_PATH+:}$LD_LIBRARY_PATH
   export LD_LIBRARY_PATH
   cd build-mpigcc
   ../configure CC=mpicc CXX=mpicxx F77=mpif77 FC=mpif90 GCJ=no \
                LDFLAGS=-dynamic
   gmake
   gmake -k check VERBOSE=yes TESTSUITEFLAGS='-k F77 -k FC' TESTS="\
        tests/f77demo-static.test \
        tests/f77demo-static-make.test \
        tests/f77demo-static-exec.test \
        tests/f77demo-conf.test \
        tests/f77demo-conf-make.test \
        tests/f77demo-conf-exec.test \
        tests/f77demo-shared.test \
        tests/f77demo-shared-make.test \
        tests/f77demo-shared-exec.test \
        tests/fcdemo-static.test \
        tests/fcdemo-static-make.test \
        tests/fcdemo-static-exec.test \
        tests/fcdemo-conf.test \
        tests/fcdemo-conf-make.test \
        tests/fcdemo-conf-exec.test \
        tests/fcdemo-shared.test \
        tests/fcdemo-shared-make.test \
        tests/fcdemo-shared-exec.test"
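
The first two steps (finding the library and prepending its directory)
could be sketched like this.  The helper names find_lib_dir and
prepend_ld_library_path are made up, and which directory tree to search
is a guess you will have to adapt to your system.  The ${VAR+:}
expansion adds the separating colon only when LD_LIBRARY_PATH is already
set.

```shell
# Hypothetical helper: print the directory of the first file named $2
# found below directory $1; fails if nothing is found.
find_lib_dir() (
  f=$(find "$1" -name "$2" -print 2>/dev/null | head -n 1)
  [ -n "$f" ] && dirname "$f"
)

# Hypothetical helper: put directory $1 in front of LD_LIBRARY_PATH,
# adding a ':' separator only if the variable was already set.
prepend_ld_library_path() {
  LD_LIBRARY_PATH=$1${LD_LIBRARY_PATH+:}$LD_LIBRARY_PATH
  export LD_LIBRARY_PATH
}

# Example (not run here; the search root /bgsys is an assumption):
#   foodir=$(find_lib_dir /bgsys libmpfr.so.1) && prepend_ld_library_path "$foodir"
```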


For a nicer user experience, it would be helpful if those compilers were
rebuilt with -Wl,-rpath,$foodir in their LDFLAGS (maybe you can ask your
software providers).

In contrast, on the system John tested, the mpif77 and mpif90 drivers
work.  There, we see the following failures:

  - f77demo-exec.test after f77demo-{static,conf,shared}.test:

    | mpicc -DHAVE_CONFIG_H -I. -I/home/cary/libtooling/libtool/tests/fcdemo  -I/home/cary/libtooling/libtool/tests/fcdemo/../..   -g -O2 -c /home/cary/libtooling/libtool/tests/fcdemo/cprogram.c
    | /bin/sh ./libtool --tag=CC   --mode=link mpicc  -g -O2  -dynamic -o cprogram cprogram.o libmix.la -L/bgsys/drivers/V1R3M0_460_2008-081112P/ppc/gnu-linux/lib/gcc/powerpc-bgp-linux/4.1.2 -L/bgsys/drivers/V1R3M0_460_2008-081112P/ppc/gnu-linux/lib/gcc/powerpc-bgp-linux/4.1.2/../../../../powerpc-bgp-linux/lib -ldl -lgfortranbegin -lgfortran -lm -ldl
    | libtool: link: mpicc -g -O2 -dynamic -o cprogram cprogram.o  ./.libs/libmix.a -L/bgsys/drivers/V1R3M0_460_2008-081112P/ppc/gnu-linux/lib/gcc/powerpc-bgp-linux/4.1.2 -L/bgsys/drivers/V1R3M0_460_2008-081112P/ppc/gnu-linux/lib/gcc/powerpc-bgp-linux/4.1.2/../../../../powerpc-bgp-linux/lib /bgsys/drivers/V1R3M0_460_2008-081112P/ppc/gnu-linux/powerpc-bgp-linux/lib/libgfortranbegin.a /bgsys/drivers/V1R3M0_460_2008-081112P/ppc/gnu-linux/powerpc-bgp-linux/lib/libgfortran.so -lm -ldl -Wl,-rpath -Wl,/bgsys/drivers/V1R3M0_460_2008-081112P/ppc/gnu-linux/powerpc-bgp-linux/lib -Wl,-rpath -Wl,/bgsys/drivers/V1R3M0_460_2008-081112P/ppc/gnu-linux/powerpc-bgp-linux/lib
    [...]
    | f77demo-exec.test: ===  Running f77demo-exec.test
    | f77demo-exec.test: ===  Executing uninstalled programs in build-mpigcc
    |  Welcome to GNU libtool Fortran demo!
    |  Real programmers write in FORTRAN.
    |  fsub called
    |  fsubf called
    |  fsub returned, saying that 2 *           2  =           4
    |  fsub is ok!
    |  fsub3 called
    |  fsub3 returned, saying that 4 *           2  =           8
    |  fsub3 is ok!
    | tests/defs: line 1132: 23888 Illegal instruction     tests/f77demo/cprogram
    | f77demo-exec.test: ../tests/f77demo-exec.test: cannot execute tests/f77demo/cprogram

  - likewise for fcdemo-exec.test after fcdemo-{static,conf,shared}.test

  Again, I don't know what this is about yet.


Back to Christian's logs.

The config.log file shows that the mpicc and mpicxx drivers might have a
typo in their setup or specs somewhere, a space replaced by a hyphen:

| configure:2955: mpicc -V >&5
| powerpc-bgp-linux-gcc: couldn't run '/bgsys/drivers/V1R3M0_460_2008-081112P/ppc/gnu-linux/bin/powerpc-bgp-linux-gcc--I/bgsys/drivers/V1R3M0_460_2008-081112P/ppc/comm/default/include': No such file or directory
| configure:2959: $? = 1

| configure:13656: mpicxx -V >&5
| powerpc-bgp-linux-g++: couldn't run '/bgsys/drivers/V1R3M0_460_2008-081112P/ppc/gnu-linux/bin/powerpc-bgp-linux-gcc--I/bgsys/drivers/V1R3M0_460_2008-081112P/ppc/comm/default/include': No such file or directory
| configure:13660: $? = 1

Reporting this to your system administrators could be a good idea; it
should be easily fixed once the relevant spec file is found.
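
To see what the wrapper would actually run, it may help to ask it
directly.  This is a sketch under an assumption: MPICH-style compiler
wrappers accept -show and print the underlying command line without
executing it, and I have not verified that the Blue Gene drivers behave
the same way.  The helper name show_wrapper is made up.

```shell
# Hypothetical helper: print the command an MPI compiler wrapper would run.
# Assumption: the wrapper is MPICH-style and understands "-show".
show_wrapper() {
  if command -v "$1" >/dev/null 2>&1; then
    "$1" -show
  else
    echo "$1: not found in PATH" >&2
    return 127
  fi
}

# Example (not run here): show_wrapper mpicc; show_wrapper mpicxx
```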


I can't make heads or tails of the mpif77/mpif90 response to this:

| configure:17326: mpif77 -V >&5
| powerpc-bgp-linux-gfortran: '-V' must come at the start of the command line
| configure:17330: $? = 1

Probably it uses -V for some other meaning; anyway, --version works
which is sufficient.


The mpicc and mpicxx drivers get depmode gcc3, yay!


>>   # XL, MPI
>>   cd build-mpixl
>>   ../configure CC=mpixlc CXX=mpixlC F77=mpixlf FC=mpixlf95 GCJ=no \
>>                LDFLAGS=-qnostaticlink
>>   make
>>   make -k check VERBOSE=yes 2>&1 | tee checklog-mpixl-1
>>   cd ..

The CXX=mpixlC was wrong, causing all the tagdemo tests to fail.
Dunno what the right name would have been.

The F77=mpixlf was wrong, too, causing all the f77demo tests to fail.

The fcdemo-exec.test fails after fcdemo-static.test like this:

| fcdemo-exec.test: ===  Executing uninstalled programs in build-mpixl
| tests/defs: line 1132: 16935 Illegal instruction     tests/fcdemo/fprogram
| fcdemo-exec.test: ../tests/fcdemo-exec.test: cannot execute tests/fcdemo/fprogram

fcdemo-make.test fails after fcdemo-{conf,shared}.test:

| /bin/sh ./libtool   --mode=link mpixlf95  -g  -qnostaticlink -o fprogram fprogram.o libfoo.la libfoo3.la -ldl
| libtool: link: LD_RUN_PATH="/u/fzj301zm/BlueGene/fortran_libraries_on_the_blue_gene_with_mpi/libtool/build-mpixl/_inst/lib:" mpixlf95 -g -qnostaticlink -o .libs/fprogram fprogram.o  ./.libs/libfoo.so /u/fzj301zm/BlueGene/fortran_libraries_on_the_blue_gene_with_mpi/libtool/build-mpixl/tests/fcdemo/.libs/libfoo2.so ./.libs/libfoo3.so -ldl
| libtool: link: LD_RUN_PATH="/u/fzj301zm/BlueGene/fortran_libraries_on_the_blue_gene_with_mpi/libtool/build-mpixl/_inst/lib:" mpixlf95 -g -qnostaticlink -o .libs/fprogram fprogram.o  ./.libs/libfoo.so /u/fzj301zm/BlueGene/fortran_libraries_on_the_blue_gene_with_mpi/libtool/build-mpixl/tests/fcdemo/.libs/libfoo2.so ./.libs/libfoo3.so -ldl
| /bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/ld: attempted static link of dynamic object `./.libs/libfoo.so'
| gmake[5]: *** [fprogram] Error 1

The new testsuite fails for similar reasons as the old, in C++ and
Fortran tests, and shows missing libz again.

The config.log file again shows the Dwarf Error and the
| /lib/: cannot read file data: Is a directory

issue.

Cheers,
Ralf



