espressomd-devel
Re: [ESPResSo-devel] [bug #37281] cannot read blockfile with mpirun


From: Axel Arnold
Subject: Re: [ESPResSo-devel] [bug #37281] cannot read blockfile with mpirun
Date: Thu, 06 Sep 2012 16:23:51 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:14.0) Gecko/20120713 Thunderbird/14.0

Hi,

there is a global variable blockfile_variable_blacklist, where you can specify which variables to ignore when writing, see the User's guide. Analogously, there is a whitelist of variables to read.

However, I would strongly recommend writing out only those variables that are really necessary to restore the script's state (blockfile $out write tclvariables {a b c}). Otherwise, if loop variables and others change later, e.g. because you extended your script, reading the blockfile will tend to fail.

Regards,
Axel

On 09/06/2012 12:06 PM, Stefan Kesselheim wrote:
Dear Martin,
thanks for submitting this problem.
It is not a bug of Espresso (see below) and in these cases, or if you are unsure, please send a mail to the espresso users mailing list: address@hidden.
This list is read by more people, and often other users can help out.
For your problem: You are using the variable name "in" as input file channel. In the blockfile, there is a variable "in" defined. When reading this variable, you destroy the input channel.
It's a bit tricky, but you will have to make sure that there are no variable name collisions.
I'm not sure how you arrived at the diagnosis that the problem is related to MPI, because in principle MPI and blockfiles are fully orthogonal: the blockfile contains information interpreted by the Tcl interpreter, and this is only done in the process with rank 0, so all its content is interpreted serially on the Tcl level.
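
The name collision described above can be reproduced in plain Tcl, without ESPResSo or MPI. This is only a minimal sketch: the proc name demo_collision, the file name collision_demo.txt and the overwriting value "not_a_channel" are made up for illustration, and the plain "set in ..." merely stands in for a blockfile restoring a saved Tcl variable called "in".

```tcl
# Pure-Tcl sketch of a channel variable being overwritten while reading.
# All names here are hypothetical; no ESPResSo commands are used.
proc demo_collision {} {
    set path "collision_demo.txt"

    # Write a small two-line file to read back.
    set out [open $path "w"]
    puts $out "first line"
    puts $out "second line"
    close $out

    # "in" holds the channel handle, as in the reported script.
    set in [open $path "r"]
    set handle $in          ;# keep a copy so the channel can still be closed
    gets $in line

    # Stand-in for restoring a saved variable named "in":
    # the channel handle stored in "in" is replaced by the saved value.
    set in "not_a_channel"

    # The next read treats the overwritten value as a channel name and fails.
    set failed [catch {gets $in line} msg]

    close $handle
    file delete $path
    return [list $failed $msg]
}

lassign [demo_collision] failed msg
puts "read failed: $failed"
puts "message: $msg"
```

A less generic name for the channel variable makes such a collision much less likely, although it only helps as long as the blockfile does not define a variable with the same name.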

Finally: A smaller blockfile (that is not compressed) would have saved time in finding the reason :-). I merely stumbled over it through a similar name collision while trying to reproduce your problem.

Good luck and cheers
Stefan

On 09/06/2012 11:18 AM, Martin Linden wrote:
URL:
  <http://savannah.nongnu.org/bugs/?37281>

                 Summary: cannot read blockfile with mpirun
                 Project: ESPResSo
            Submitted by: bmelinden_dbb
            Submitted on: Thu 06 Sep 2012 09:18:34 AM GMT
                Category: Simulation core
                Severity: 3 - Normal
                  Status: None
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
                 Release: 3.1.0
           Fixed Release: None

    _______________________________________________________

Details:

I have trouble reading in blockfiles when using mpi. 
I would say this is a pretty serious problem, because it means that I can only
restart crashed simulations on a single processor.

The attached script and gzip archive demonstrate the problem. Without MPI,
Espresso reads the block file and starts integrating:

$ Espresso blockread.tcl

With mpi, the file is somehow not opened, or not recognized:

$ mpirun -n 4 Espresso blockread.tcl

(...)

can not find channel named "file14"
    while executing
"blockfile $in read auto"
    invoked from within
"while { [blockfile $in read auto] != "eof" } {}"
    (file "blockread.tcl" line 6)
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------

The block file in question was in fact created using mpi, so writing blocks
seems to work.

System and version:
ubuntu 12-04 64 bit, espresso 3.1.0, code_info: { Compilation status { FFTW }
{ BOND_ANGLE_HARMONIC } { LENNARD_JONES } { LJCOS } { LJCOS2 } { EXCLUSIONS }
}

Best,

Martin Lindén, Stockholm University




    _______________________________________________________

File Attachments:


-------------------------------------------------------
Date: Thu 06 Sep 2012 09:18:34 AM GMT  Name: blockread.tcl  Size: 1kB   By:
bmelinden_dbb

<http://savannah.nongnu.org/bugs/download.php?file_id=26486>
-------------------------------------------------------
Date: Thu 06 Sep 2012 09:18:34 AM GMT  Name: checkpoint.gz  Size: 170kB   By:
bmelinden_dbb

<http://savannah.nongnu.org/bugs/download.php?file_id=26487>

    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?37281>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/

   


-- 
JP Dr. Axel Arnold
ICP, Universität Stuttgart
Pfaffenwaldring 27
70569 Stuttgart, Germany
Email: address@hidden
Tel: +49 711 685 67609
