From: Georg Rempfer
Subject: Re: [ESPResSo-users] Cuda Memory Error
Date: Tue, 22 Mar 2016 12:54:13 +0100
The problem occurs the first time the line is executed. Thanks for looking it up!
From: address@hidden [mailto:address@hidden] On Behalf Of Georg Rempfer
Sent: Tuesday, 22 March 2016 12:04
Does the problem happen the first time this line is executed? In that case your memory is actually too small (I'll look at the malloc in a second to see how much is needed). Or has this line already worked once or several times? In that case there is a memory leak.
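To tell the two cases apart, one can sample the GPU's used memory between integration blocks and check whether it keeps growing. The following is only a minimal sketch: it assumes the `nvidia-smi` tool is on the PATH, and the helper names (`parse_used_mib`, `query_used_mib`, `grows_steadily`) are hypothetical and not part of ESPResSo.

```python
import subprocess


def parse_used_mib(nvidia_smi_output):
    """Parse one integer (MiB) per GPU from csv,noheader,nounits output."""
    return [int(line.strip())
            for line in nvidia_smi_output.splitlines()
            if line.strip()]


def query_used_mib():
    """Ask nvidia-smi for the used memory of every visible GPU, in MiB."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_used_mib(out)


def grows_steadily(samples, min_increase_mib=1):
    """True if each sample exceeds the previous one by min_increase_mib or more.

    Steady growth across many samples hints at a leak; a failure on the
    very first allocation hints at a request that is simply too large.
    """
    return len(samples) >= 2 and all(
        later - earlier >= min_increase_mib
        for earlier, later in zip(samples, samples[1:])
    )
```

Calling `query_used_mib()` after every few hundred integration steps and passing the collected values for one GPU to `grows_steadily` would give a rough indication; the threshold of 1 MiB is an arbitrary choice.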
On Tue, Mar 22, 2016 at 11:54 AM, Wink, Markus <address@hidden> wrote:
True, sorry about that.
I think I found the line in my script that causes the error. I was trying to save the state of the fluid (lbfluid load_ascii_checkpoint). Calling it exceeds the maximum memory.
Do you have a rule of thumb for how much memory the lbfluid load_ascii_checkpoint command needs on the GPU (maybe as a function of the simulation box size)?
Greetings
Markus
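As a rough lower bound for such a rule of thumb, one can count the bytes needed to hold the LB populations for a given lattice. This is a back-of-the-envelope sketch, not a statement about what ESPResSo actually allocates: it assumes a D3Q19 lattice (19 populations per node) stored as 4-byte floats, and `copies` is a hypothetical parameter for any additional full-size buffers (for example a second population array or a staging buffer used while loading a checkpoint).

```python
def lb_population_bytes(nx, ny, nz, velocities=19, bytes_per_value=4, copies=1):
    """Bytes needed to store `copies` full sets of LB populations.

    Assumptions (not taken from the ESPResSo source): D3Q19 lattice and
    single-precision values. Other per-node data is not counted.
    """
    nodes = nx * ny * nz
    return nodes * velocities * bytes_per_value * copies


if __name__ == "__main__":
    # Lattice size quoted in this thread: 1200 x 300 x 130 nodes.
    one_copy = lb_population_bytes(1200, 300, 130)
    print(f"one copy: {one_copy / 2**30:.2f} GiB")
    print(f"two copies: {2 * one_copy / 2**30:.2f} GiB")
```

Under these assumptions one copy of the populations for the 1200x300x130 lattice takes about 3.3 GiB, so every additional full-size buffer uses a sizeable fraction of a 12 GB card.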
From: address@hidden [mailto:address@hidden] On Behalf Of Georg Rempfer
Sent: Tuesday, 22 March 2016 11:48
To: Wink, Markus
Cc: address@hidden
Subject: Re: [ESPResSo-users] Cuda Memory Error
I assume by RAM you mean the memory of the GPU?
On Tue, Mar 22, 2016 at 11:22 AM, Wink, Markus <address@hidden> wrote:
Hello everybody,
I want to simulate a fairly large system (1200x300x130 LB nodes) on a GPU. The RAM is sufficient (12 GB) and I can start the simulation. Nevertheless, after a few integration steps the simulation stops with the error message shown at the bottom of this mail.
I monitored the GPU's memory usage during the simulation and noticed that the memory needed by the simulation increases with time (the simulation crashes when no memory is left on the GPU).
Why does the required memory increase with time? Is there an asymptotic maximum for the memory needed? Can I somehow avoid the increase?
Greetings
Markus
Cuda Memory error at /home/wink/Dokumente/espresso-master/20150804_fixed/espresso-master/src/core/lbgpu_cuda.cu:3572.
CUDA error: invalid argument
You may have tried to allocate zero memory at /home/wink/Dokumente/espresso-master/20150804_fixed/espresso-master/src/core/lbgpu_cuda.cu:3572.
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMMUNICATOR 3
with errorcode -1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------