From: 赵睿祺
Subject: [ESPResSo-users] Problem of checkpointing with mpi
Date: Mon, 6 May 2019 22:24:03 +0800 (GMT+08:00)
Dear all,
I am having a problem with checkpointing under MPI. I want to register the system that I set up in part1.py in a checkpoint and then load it in part2.py.
When I run the scripts without MPI, everything works. The command I use is
./pypresso <SCRIPT>
However, when I run the same scripts with MPI,
mpirun -n 32 ./pypresso <SCRIPT>
the following error occurs:
_______________________________________________________________________________
terminate called after throwing an instance of 'std::out_of_range'
what(): _Map_base::at
[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22768] *** Process received signal ***
[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22768] Signal: Aborted (6)
[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22768] Signal code: (-6)
[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22768] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7fe4d0054390]
[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22768] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38)[0x7fe4cfcae428]
[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22768] [ 2] terminate called after throwing an instance of 'std::out_of_range'
……
[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22772] *** End of error message ***
x4bec4b]
[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22746] *** End of error message ***
x4bec4b]
[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22747] *** End of error message ***
x4bec4b]
[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22743] *** End of error message ***
x4bec4b]
[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22744] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 4 with PID 22746 on node zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11 exited on signal 6 (Aborted).
How can I solve this problem? Thank you very much for your help!
Best regards,
Ricky Zhao
Attachment: part1.py
Attachment: part2.py