Re: Spreading parallel across nodes on HPC system
From: Ken Mankoff
Subject: Re: Spreading parallel across nodes on HPC system
Date: Fri, 11 Nov 2022 08:02:50 +0100
User-agent: mu4e 1.8.10; emacs 27.1
Hello,
On 2022-11-10 at 21:27 +01, Christian Meesters <meesters@uni-mainz.de> wrote:
> https://mogonwiki.zdv.uni-mainz.de/dokuwiki/start:working_on_mogon:workflow_organization:node_local_scheduling#running_on_several_hosts
That example uses "SLURM_CPUS_PER_TASK". From
https://slurm.schedmd.com/sbatch.html
SLURM_CPUS_PER_TASK
Number of cpus requested per task. Only set if the --cpus-per-task option
is specified.
I am not specifying --cpus-per-task, in part because it is more efficient to
let SLURM handle the distribution. I request 32 tasks, but I may get 2 nodes
with 16 CPUs each, or 2 nodes with 31 CPUs on one and 1 CPU on the other, or
any other combination (10 nodes, with 5 CPUs on one and 3 on each of the
rest).
The problem is trivial if I am more specific in my request to SLURM, for
example --nodes=1 --ntasks=32 to force all cores onto the same node. But that
will make my jobs harder to queue.
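For what it's worth, one way to cope with an arbitrary task layout is to read
the per-node counts SLURM actually granted (SLURM_TASKS_PER_NODE, e.g.
"16(x2)" or "31,1") and turn them into GNU parallel's "N/host" sshlogin
entries. A minimal bash sketch of the parsing, assuming the hostnames come
from `scontrol show hostnames "$SLURM_JOB_NODELIST"` (the hostnames below are
placeholders):

```shell
#!/usr/bin/env bash
# Expand SLURM_TASKS_PER_NODE syntax ("16(x2)" or "31,1") into one
# count per line, so each count can be paired with a hostname.
expand_tasks() {
  local -a items
  local item count reps i
  IFS=',' read -ra items <<< "$1"
  for item in "${items[@]}"; do
    if [[ $item =~ ^([0-9]+)\(x([0-9]+)\)$ ]]; then
      count=${BASH_REMATCH[1]}; reps=${BASH_REMATCH[2]}
      for ((i = 0; i < reps; i++)); do echo "$count"; done
    else
      echo "$item"
    fi
  done
}

# Pair each count with its hostname in GNU parallel's "N/host" form,
# which tells parallel to run N jobs on that host.
# $2 is a whitespace-separated host list; the unquoted expansion is
# deliberate so printf emits one host per line.
make_sshlogins() {
  paste -d/ <(expand_tasks "$1") <(printf '%s\n' $2)
}

make_sshlogins "31,1" "node01 node02"
# -> 31/node01
#    1/node02
```

Inside a job script, something like
`parallel --sshloginfile <(make_sshlogins "$SLURM_TASKS_PER_NODE" "$(scontrol show hostnames "$SLURM_JOB_NODELIST")") ...`
should then respect whatever split SLURM chose, without requiring
--cpus-per-task.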
-k.