...
When to use job arrays: For nodes that can be shared (like the gpuq partition in Topaz), the best practice is to use job arrays. A disadvantage of job packing on shared nodes is that unbalanced steps can hold resources unnecessarily. Job arrays avoid this problem because, as soon as any job in the array finishes or fails, the resources for that job are freed for use by other users.
...
Column |
---|
|
Code Block |
---|
language | bash |
---|
theme | Emacs |
---|
title | Listing 21. GPU job array example |
---|
| #!/bin/bash --login
#SBATCH --account=[your-account]-gpu
#SBATCH --array=0-7
#SBATCH --partition=gpuq
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=80G
#SBATCH --gres=gpu:1
#SBATCH --time=00:10:00
#Default loaded compiler module is gcc module
module load cuda
#Go to the right directory for this instance of the job array using SLURM_ARRAY_TASK_ID as the identifier:
#We are assuming all the input files needed for each specific job reside in the corresponding working directory
cd workingDir_${SLURM_ARRAY_TASK_ID}
#Run the cuda executable (assuming the same executable will be used by each job, and that it resides in the submission directory):
srun -u -N 1 -n 1 -c 1 ${SLURM_SUBMIT_DIR}/main_cuda |
|
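The listing above assumes that one working directory per array index already exists. The following is a minimal sketch of how those directories might be staged before submission, and how a task could verify its directory before running. The helper names `stage_dirs` and `task_dir` and the scratch location are hypothetical, not part of the original listing.

```shell
#!/bin/bash
# Hypothetical helper: create workingDir_<first> .. workingDir_<last> under a base directory.
stage_dirs() {
    local base=$1 first=$2 last=$3 id
    for id in $(seq "$first" "$last"); do
        mkdir -p "${base}/workingDir_${id}"
    done
}

# Hypothetical helper: print the directory for one array task, or fail early if it is missing.
task_dir() {
    local base=$1 id=$2
    local dir="${base}/workingDir_${id}"
    if [ ! -d "$dir" ]; then
        echo "Missing directory: $dir" >&2
        return 1
    fi
    echo "$dir"
}

# Stage directories matching --array=0-7 in a temporary scratch location (assumption).
base=$(mktemp -d)
stage_dirs "$base" 0 7

# Inside a job, Slurm sets SLURM_ARRAY_TASK_ID; default to 0 here for a dry run.
dir=$(task_dir "$base" "${SLURM_ARRAY_TASK_ID:-0}") && echo "Task would run in: $dir"
```

Checking the directory before calling srun lets a misnumbered array task fail immediately instead of holding a GPU while the executable searches for missing input.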
When to use job packing: For nodes where resources are exclusive and cannot be shared among different users/jobs at the same time (like the nvlinkq partition in Topaz), the best practice is to use job packing. Ideally, multiple jobs should be packed in order to make use of the four available GPUs in the node. (If a single job can make use of all four GPUs, that is also desirable and does not need packing.) We do not recommend packing jobs across multiple nodes with the same job script due to possible load balancing issues: all resources will be held and unavailable to other users/jobs until the last substep (job) in the packing finishes.
...
Column |
---|
|
Code Block |
---|
language | bash |
---|
theme | Emacs |
---|
title | Listing 22. GPU job packing example using multiple steps simultaneously |
---|
| #!/bin/bash --login
#SBATCH --account=[your-account]-gpu
#SBATCH --partition=nvlinkq
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --ntasks-per-socket=2 #maximum 2 tasks per socket (each socket has 2 GPUs in this partition)
#SBATCH --cpus-per-task=1
#SBATCH --gres=gpu:4
#SBATCH --time=00:10:00
#Default loaded compiler module is gcc module
module load cuda
for tagID in $(seq 0 3); do
#Go to the right directory for this step of the job pack using tagID as the identifier:
#We are assuming all the input files needed for each specific job reside in the corresponding working directory
cd ${SLURM_SUBMIT_DIR}/workingDir_${tagID}
#Defining an output file for this step
outputFile=results_${tagID}.out
echo "Starting" > $outputFile
#Run the cuda executable (assuming the same executable will be used by each step, and that it resides in the submission directory):
srun -u -N 1 -n 1 --mem=56G --gres=gpu:1 --exact ${SLURM_SUBMIT_DIR}/main_cuda >> $outputFile &
done
wait |
Note |
---|
| - In the header, a total of four GPUs is requested. For each job step, the specific number of GPUs to be used (1 in this case) is indicated. The use of --mem=56G indicates the amount of memory to be allocated for each step, and the --exact option restricts each step to only the resources requested for it.
- Note the logic of using "& ... & ... wait" to execute each step in the background and wait for all of them to finish before ending the job script.
- In the loop, the iterator (numeric identifier) for each step is defined to start at 0 in order to be equivalent to the natural numbering of Slurm, but you can use any start and end value to be consistent with your own naming of directories, input files and output files.
|
|
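A bare wait returns once all background steps have finished, but it does not report which of them failed. The following is a minimal sketch of one way to track the exit status of each packed step. It runs without Slurm: the `run_step` function is a hypothetical stand-in for the `srun ... main_cuda` line, and it deliberately fails for step 2 to show the reporting.

```shell
#!/bin/bash
# Hypothetical stand-in for "srun ... main_cuda"; fails for tagID 2 for illustration only.
run_step() {
    local tagID=$1
    sleep 1
    [ "$tagID" -ne 2 ]
}

# Launch each step in the background and record "tagID:PID" pairs.
pids=""
for tagID in $(seq 0 3); do
    run_step "$tagID" &
    pids="${pids} ${tagID}:$!"
done

# Wait on each PID individually so the exit status of every step is known.
failed=0
for entry in $pids; do
    tagID=${entry%%:*}
    pid=${entry##*:}
    if ! wait "$pid"; then
        echo "Step ${tagID} failed"
        failed=$((failed + 1))
    fi
done
echo "Failed steps: ${failed}"
```

In a real job script, the final count could be used to set the exit code of the job so that a failed step is visible in the accounting records.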
...