Re: compile MPITB, octave 2.1.69

From: Michael Creel
Subject: Re: compile MPITB, octave 2.1.69
Date: Fri, 01 Apr 2005 17:59:04 +0200
User-agent: KMail/1.7.2

Thanks Javier, with this info I think I'll be able to get somewhere. Your 
original Makefile.env didn't work on Debian, and it was Thomas who found that 
adding mpi++, etc., seemed to solve the problem. The funny thing is that I 
could compile with Octave 2.1.67, but not 2.1.69. However, the stuff you 
write below makes me think that something else may be the problem. With 
respect to calculations varying slightly depending upon nodes, I'm glad to 
hear that this is expected. The BFGS minimization problem has parallel and 
serial sections interspersed, and seems to amplify this problem. It's not a 
big deal, the problem converges to the same solution, but the exact number of 
iterations to reach convergence can vary. There is a good speedup, though.

Thanks, M.

On Friday 01 April 2005 17:43, Javier Fernandez Baldomero wrote:
> Michael Creel wrote:
> >to missing symbol errors with 2.1.69. Just in case anyone knows what
> > changes might have provoked this and how to fix the problem. The error
> > follows. ...
> >MPITB extensions found
> >octave:1> kernel_example1
> >error: /home/mcreel/mpi_work/mpitb/DLD/MPI_Initialized.oct: undefined
> > symbol: _ZN4PMPI4Comm12mpi_comm_mapE
> >error: `MPI_Initialized' undefined near line 10 column 17
> >error: called from `LAM_Init' in file
> >`/usr/local/share/octave/site-m/mpitb_utils/LAM_Init.m'
> Hi,
> WRT the error message:
> LAM_Init.m is an M-file that contains near the beginning:
>  > ...
>  >      [infI flgI]=MPI_Initialized;               % Init?
>  >      [infF flgF]=MPI_Finalized;                 % Finalize?
>  >      if infI ||  infF
>  > ...
> that accounts for the last 3 lines of the error.
> Then, Octave finds MPI_Initialized.oct in the DLD subdir,
> but apparently your re-compiled version (oddly) requires some _ZN4...
> symbol.
> I'm not sure what might have caused that problem.
> I see you added -lmpich and -lmpi++ to the library list...
> With the original library list, the symbol MPI_Initialized comes from
> and no other MPI library is required... This is the situation in my system:
> ____________________________________________
> $ ls -la MPI_Initialized.oct
> -rwx------  1 javier javier 18383 may  5  2004 MPI_Initialized.oct
> $ ldd !$
> ldd MPI_Initialized.oct
> =>  (0xffffe000)
> => /home/javier/lam-7.1.1/lib/ (0xb7fa5000)
> => /home/javier/lam-7.1.1/lib/ (0xb7f5f000)
> => /lib/ (0xb7f49000)
> => /lib/tls/i686/ (0xb7e39000)
> => /lib/tls/i686/ (0xb7e28000)
> => /lib/ (0xb7e24000)
>         /lib/ (0x80000000)
> $ nm !$ | grep MPI
> nm MPI_Initialized.oct | grep MPI
> 000018e4 T FSMPI_Initialized_gnu_v3
> 00001d40 t _GLOBAL__I_FSMPI_Initialized_gnu_v3
>          U MPI_Initialized
> 00001b2e T _Z16FMPI_InitializedRK17octave_value_listi
> $ nm MPI_Initialized.oct | grep U
>          U __cxa_atexit@@GLIBC_2.1.3
>          U error_state
>          U __gxx_personality_v0
>          U MPI_Initialized
> ...
>          U _ZNSt8ios_base4InitC1Ev
>          U _ZNSt8ios_base4InitD1Ev
>          U _Znwj
> ____________________________________________
> So, using nm I can tell which undefined symbols my .oct file relies on,
> and using ldd I can tell which libraries it plans to get them from.
> If I double-check for MPI_Init in my ldd list:
> ____________________________________________
> $ cd $LAMHOME/lib
> $ pwd
> /home/javier/lam-7.1.1/lib
> $ ls
> lam
> $ nm | grep Init
> 000173fc T MPI_Init
> 00017490 T MPI_Initialized
> 000174b8 T MPI_Init_thread
> 00045234 T PMPI_Init
> 000452c8 T PMPI_Initialized
> 000452f0 T PMPI_Init_thread
> $ nm | grep map
> 00052db8 b cid_map
> 00052db4 b empty_map
> 0003b66c T lam_ptmalloc2_munmap
> 00052bb8 d map_size
>          U mmap@@GLIBC_2.0
> 0000fb58 T MPI_Cart_map
> 000151b0 T MPI_Graph_map
>          U munmap@@GLIBC_2.0
> 0003d9f0 T PMPI_Cart_map
> 00042fe8 T PMPI_Graph_map
> ____________________________________________
> So, my LAM library does not contain any _comm_map symbol.
> Since the offending symbol name was:
> octave:1> kernel_example1
> error: /home/mcreel/mpi_work/mpitb/DLD/MPI_Initialized.oct: undefined
> symbol: _ZN4PMPI4Comm12mpi_comm_mapE
> error: `MPI_Initialized' undefined near line 10 column 17
> error: called from `LAM_Init' in file
> `/usr/local/share/octave/site-m/mpitb_utils/LAM_Init.m'
> I deduce that the _ZN4... symbol comes from the .oct file.
> I mean, it's not a problem of the LAM library missing any symbol.
> In the compile step you have inadvertently added a dependency on
> that symbol, which I don't recognize; I'm sure it's not part of
> LAM/MPI. I know MPI_Cart_map and MPI_Graph_map, but no
> PMPI::Comm::mpi_comm_map.
> Sounds like some kind of C++ binding. Are you sure you have linked
> against liblam/libmpi ?!? Perhaps those are missing in your system,
> and since you added -lmpich and -lmpi++, the offending symbol
> comes from there.
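
A minimal sketch of why the symbol reads like a C++ binding, assuming the usual GNU (Itanium ABI) name mangling: a simple nested name is encoded as _ZN, then length-prefixed identifiers, then E. The function name demangle_nested_name is hypothetical and only handles that plain form (no templates, qualifiers, substitutions, or parameter lists); the real tool for this job is c++filt from binutils.

```python
def demangle_nested_name(symbol: str) -> str:
    """Decode a plain nested name of the form _ZN(<length><identifier>)+E.

    Illustrative only: anything beyond that plain form is rejected.
    """
    if not (symbol.startswith("_ZN") and symbol.endswith("E")):
        raise ValueError("not a plain nested name: " + symbol)

    body = symbol[3:-1]          # strip the "_ZN" prefix and the "E" terminator
    parts = []
    pos = 0
    while pos < len(body):
        # Read the decimal length prefix.
        start = pos
        while pos < len(body) and body[pos] in "0123456789":
            pos += 1
        if pos == start:
            raise ValueError("expected a length prefix in: " + symbol)
        length = int(body[start:pos])

        # Read exactly that many characters as one identifier.
        name = body[pos:pos + length]
        if len(name) != length:
            raise ValueError("truncated identifier in: " + symbol)
        parts.append(name)
        pos += length

    return "::".join(parts)


if __name__ == "__main__":
    # The symbol from the error message above.
    print(demangle_nested_name("_ZN4PMPI4Comm12mpi_comm_mapE"))
```

Read this way, the missing symbol is a member of a Comm class inside a PMPI namespace, which fits the guess that it comes from a C++ MPI binding rather than from the C library.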
> Look for _comm_map in libmpich and libmpi++ using nm
> or some other similar tool just to be certain of the diagnostic,
> but in any case, to get MPITB working, you'll probably need
> liblam/libmpi (I cannot help porting MPITB to mpich/mpi++)
> Let me know if the symbol came from libmpi++ and if
> you manage to recompile against LAM libraries (removing
> both -lmpich and -lmpi++ from the library list in MPICLIBS)
> My original version was:
>  > MPICLIBS    = -L$(LAMHOME)/lib -lmpi -llam -lutil
> _____________________________________________________
> > I'm getting different results when doing what are in principle the same
> > calculations using Octave serially and in parallel, using the MPITB
> > toolkit.
> I remember the Pi demo in MPITB shows an in principle similar behaviour.
> It integrates arctan' to compute pi. The integration is done round-robin
> in the sense that the rectangles are indexed, and for N slaves 0..N-1,
> the 0-th slave computes and sums the areas of rectangles i==0,N,2N...
> 1-st slave the areas of rectangles such that (i mod N)=1 and so on.
> Depending on the number of slaves used, the computation is slightly
> different, and all those computations are different from the sequential
> computation.
> Of course, with the sequential computation, the whole sum ends up in the
> same variable, so the last rectangles are very small compared to the
> accumulated value (near to pi). When you use 10 computers, each one
> accumulates a value close to pi/10, so the last rectangles are one order
> of magnitude better (for rounding error purposes) than in the sequential
> version.
> Perhaps your problem is related to this one (rounding errors) or perhaps
> not.
> For the Pi example, the differences were unavoidable (I must sum the areas
> that way if I'm expected to distribute the computations) and acceptable
> (errors only one-two orders of magnitude above double-precision resolution,
> ie: in the 14th-15th significant digit). Or was that single-precision?
> Oh, my, can't remember ;-)
> -javier
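
The round-robin summation described above can be sketched in plain Python, without MPI, to see how the grouping of additions changes the low-order digits. This is not the MPITB demo itself: the midpoint rule, the interval [0, 1], the integrand 4/(1+x^2), and all function names here are assumptions made for illustration.

```python
import math


def rectangle_area(i, n):
    # Midpoint-rule rectangle i of n for 4/(1+x^2) on [0, 1] (assumed setup).
    x = (i + 0.5) / n
    return 4.0 / (1.0 + x * x) / n


def pi_sequential(n):
    # One accumulator sums every rectangle in index order.
    total = 0.0
    for i in range(n):
        total += rectangle_area(i, n)
    return total


def pi_round_robin(n, workers):
    # Worker `rank` sums rectangles rank, rank+workers, rank+2*workers, ...
    # and the partial sums are then added together, as a reduction would do.
    partials = []
    for rank in range(workers):
        partial = 0.0
        for i in range(rank, n, workers):
            partial += rectangle_area(i, n)
        partials.append(partial)
    total = 0.0
    for partial in partials:
        total += partial
    return total


if __name__ == "__main__":
    n = 100000
    print("sequential   error:", pi_sequential(n) - math.pi)
    for workers in (2, 3, 10):
        print(workers, "workers error:", pi_round_robin(n, workers) - math.pi)
```

Every variant adds the same rectangles; only the order and grouping of the floating-point additions differ, so any disagreement between runs is confined to the last few significant digits.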
> >------------------------------------------------------------------------
> >
> ># Makefile for MPITB on Debian unstable
> >...
> >WHEREARELIBS     := $(shell octave-config -p OCTLIBDIR)
> >...
> >
> >MPICPPFLAGS = -I/usr/include/lam
> >
> ># MPICLIBS seems to be necessary to avoid missing symbol errors
> ># Both of the following work, whether or not mpich is installed, and
> ># in spite of the fact that they make reference to files and/or
> ># directories that may not exist.
> ># MPICLIBS   = -L/usr/lib/mpich/lib/shared -llam -lutil -lmpich -lmpi++
> >MPICLIBS    = -L/usr/include/lam -llam -lmpi++ -lutil
> >...
