Re: Spreading parallel across nodes on HPC system

From: Christian Meesters
Subject: Re: Spreading parallel across nodes on HPC system
Date: Fri, 11 Nov 2022 08:42:44 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.3.0

On 11/11/22 08:05, Ken Mankoff wrote:
> Hi Rob,
>
> On 2022-11-10 at 21:21 +01, Rob Sargent <> wrote:
>> I do this, in slurm bash script, to get the number of jobs I want to
>> run (turns out it's better for me to not load the full hyper-threaded
>>
>>    cores=`grep -c processor /proc/cpuinfo`
>>    cores=$(( $cores / 2 ))
>>
>>    parallel --jobs $cores etc :::: <file with list of jobs>
>>
>> or sometimes the same jobs many times with
>>
>>    parallel --jobs $cores etc ::: {1..300}
>
> I apologize if I am missing something, but I don't see how this solves
> distributing to different hosts (nodes), where each host may have a
> different number of CPUs or cores.


Quite simple: slurmstepd is aware of the cluster's configuration, so it distributes the work according to the resource demands of each srun instance. srun, however, is greedy (by default it is the MPI starter), hence the '--mem-per-cpu' restriction on each call.

As to the question of the environment variable `SLURM_CPUS_PER_TASK`: If your cluster is configured as usual, it is '1' by default. If not, leave it out. For single-core applications it's not necessary. As I said: it's a template.
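
To make the idea concrete, here is a minimal sketch of such a call. It is not the template from earlier in the thread. The helper `build_srun_cmd`, the program `./my_program`, the input file `jobs.txt` and the default values are all hypothetical placeholders, and the right srun flags may differ on your cluster.

```shell
#!/bin/bash
# Sketch only: wrap every GNU parallel job in its own srun job step.
# build_srun_cmd, ./my_program and jobs.txt are hypothetical placeholders.

build_srun_cmd() {
    # Explicit arguments win; otherwise use what SLURM exports inside the
    # job, falling back to 1 core and 1024 MB (assumed defaults).
    local cpus="${1:-${SLURM_CPUS_PER_TASK:-1}}"
    local mem_mb="${2:-${SLURM_MEM_PER_CPU:-1024}}"
    # One task on one node per step; --mem-per-cpu keeps a single greedy
    # srun from claiming the memory of the whole allocation.
    echo "srun --exclusive -N1 -n1 --cpus-per-task=${cpus} --mem-per-cpu=${mem_mb}"
}

# Only launch anything when we are actually inside a SLURM allocation.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    # Run as many steps at once as tasks were requested; SLURM decides
    # on which node each step ends up.
    parallel --jobs "${SLURM_NTASKS:-1}" "$(build_srun_cmd)" ./my_program {} :::: jobs.txt
fi
```

Submitted with something like `sbatch --ntasks=8 --mem-per-cpu=1G script.sh`, parallel only limits how many steps are in flight, while placement across nodes with different core counts is left to SLURM.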

Best regards

