...
Table 2. List of options for PBS Pro and Slurm

| Option | PBS Pro | Slurm |
|---|---|---|
| Script directive | #PBS | #SBATCH |
| Job name | -N [name] | --job-name=[name] |
| Queue | -q [queue] | --partition=[queue] |
| Accounting | -W group_list=[acct] | --account=[acct] |
| Wall clock limit | -l walltime=[hh:mm:ss] | --time=[hh:mm:ss] |
| Select | -l select=[chunk] | --nodes=[chunk] |
| Node count | -l nodes=[count] | --nodes=[count] |
| CPU count | -l mpiprocs=[count], -l ppn=[count], or -l mppwidth=[count] | --ntasks-per-node=[count] (alternatively use the --ntasks=[count] option) |
| OpenMP threads | -l ompthreads=[nthr] | --cpus-per-task=[nthr] |
| Memory size | -l mem=[MB] | --mem=[mem][M\|G\|T] or --mem-per-cpu=[mem][M\|G\|T] |
| Standard output file | -o [filename] | --output=[filename] |
| Standard error file | -e [filename] | --error=[filename] |
| Combine stdout/stderr | -j oe (to stdout) | (this is the default behaviour if --output is used without --error) |
| Copy environment | -V | --export=ALL (default) |
| Copy environment variable | -v [var] | --export=[var] |
| Job dependency | -W depend=[state:jobid] | --dependency=[state:jobid] |
| Event notification | -m abe | --mail-type=[events] |
| Email address | -M [address] | --mail-user=[address] |
| GPU count | -l ngpus=[count] | --gpus-per-task=[count] |
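As an illustrative sketch (the queue name, project code, and file names below are placeholders, not taken from this guide), a typical PBS Pro preamble maps row by row onto Slurm directives:

```bash
# PBS Pro preamble (placeholder values)
#PBS -N myjob
#PBS -q workq
#PBS -W group_list=projectcode
#PBS -l walltime=02:00:00
#PBS -o myjob.out
#PBS -e myjob.err

# Equivalent Slurm preamble, following Table 2
#SBATCH --job-name=myjob
#SBATCH --partition=workq
#SBATCH --account=projectcode
#SBATCH --time=02:00:00
#SBATCH --output=myjob.out
#SBATCH --error=myjob.err
```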
To convert the PBS Pro select statement, use the --nodes and --ntasks-per-node options. For example, Listing 2 shows the Slurm equivalent of the PBS Pro directive shown in Listing 1.
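Listings 1 and 2 are not reproduced in this excerpt, but the idea can be sketched with placeholder values:

```bash
# PBS Pro: 10 chunks, each with 32 cores and 32 MPI processes (illustrative)
#PBS -l select=10:ncpus=32:mpiprocs=32

# Slurm equivalent using --nodes and --ntasks-per-node
#SBATCH --nodes=10
#SBATCH --ntasks-per-node=32
```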
...
Listing 4. A simple Slurm batch script

```bash
#!/bin/bash -l
# 10 nodes, 32 MPI processes/node, 320 MPI processes total
#SBATCH --job-name="myjob"
#SBATCH --time=02:00:00
#SBATCH --ntasks=320
#SBATCH --ntasks-per-node=32
#SBATCH --cpus-per-task=1
#SBATCH --mem=58G
#SBATCH --output=myjob.%j.o
#SBATCH --error=myjob.%j.e
#SBATCH --account=projectcode
#SBATCH --export=NONE
#======START=====
echo "The current job ID is $SLURM_JOB_ID"
echo "Running on $SLURM_JOB_NUM_NODES nodes"
echo "Using $SLURM_NTASKS_PER_NODE tasks per node"
echo "A total of $SLURM_NTASKS tasks is used"
echo "Node list:"
sacct --format=JobID,NodeList%100 -j $SLURM_JOB_ID

# -----Executing command:
srun --export=ALL -u -N $SLURM_JOB_NUM_NODES -n $SLURM_NTASKS -c $SLURM_CPUS_PER_TASK ./a.out
#=====END====
```
Line 1 invokes the shell (bash), and line 2 is a comment.
Lines 3 to 12 contain the script directives. Line 3 gives the job a name. Line 4 requests 2 hours of walltime. Line 5 requests 320 MPI processes, and line 6 requests 32 processes per node. Line 7 specifies the number of cores per MPI task. Line 8 specifies the amount of memory needed per node. Line 9 specifies the name of the output file (%j is the job number), and line 10 specifies the file to which errors should be written out. Line 11 gives the account to which this walltime should be charged, and line 12 stops environment variables set in the submission environment from being passed to the job.
Line 13 is an optional separator between the script directive preamble and the actions in the script.
Lines 14 to 19 print useful (but optional) diagnostic information.
Line 22 invokes srun to run the code (./a.out).
Lines 21 and 23 are optional comments marking sections of the script.
From this script, you can see that the same concepts you already know from PBS Pro apply to Slurm as well. You must tell the scheduler how long the job will run for and how many processors are required. You provide it with project accounting information, and there are other optional arguments that help the script to run. Within the body of the script, we can invoke the usual scripting utilities such as echo, and launch our application with the usual commands.
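As a brief usage sketch (the script filename below is a placeholder), the script is submitted with sbatch, and standard Slurm commands can be used to follow its progress:

```bash
# Submit the batch script; Slurm prints the job ID it assigns
sbatch myjob.slurm

# List your queued and running jobs
squeue -u $USER

# After the job finishes, query its accounting record by job ID
sacct -j <jobid> --format=JobID,JobName,Elapsed,State
```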
...
Table 3. Common environment variables in PBS Pro and Slurm

| Environment Variable | PBS Pro | Slurm |
|---|---|---|
| Job ID | PBS_JOBID | SLURM_JOB_ID |
| Submit directory | PBS_O_WORKDIR | SLURM_SUBMIT_DIR *† |
| Submit host | PBS_O_HOST | SLURM_SUBMIT_HOST |
| Node list | PBS_NODEFILE | SLURM_JOB_NODELIST ‡ |
| Job Array Index | PBS_ARRAY_INDEX | SLURM_ARRAY_TASK_ID |
Note:
\* PBS_O_WORKDIR and SLURM_SUBMIT_DIR both contain the name of the working directory from which the user submitted the job. When using Slurm it is not necessary to explicitly change to this directory, as this is done by default.
† When the --export=NONE option is used (as recommended), SLURM_SUBMIT_DIR is not defined.
‡ PBS_NODEFILE points to a file containing the nodes allocated to the job. SLURM_JOB_NODELIST contains a compact range expression listing the nodes, for example SLURM_JOB_NODELIST=nid000[32-39]. To expand the nodes explicitly, use the following command: scontrol show hostnames $SLURM_JOB_NODELIST
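For instance, a minimal sketch of expanding the compressed node list inside a job script (the node names shown are just an example):

```bash
# Show the compressed node list, e.g. nid000[32-39]
echo "Allocated nodes: $SLURM_JOB_NODELIST"

# Expand it to one hostname per line, similar to the file behind PBS_NODEFILE
scontrol show hostnames "$SLURM_JOB_NODELIST" > nodes.txt
cat nodes.txt
```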
...