...
#SBATCH --nodes=1 #1 node in this example
#SBATCH --gpus-per-node=3 #3 GPUs per node (3 "allocation-packs" in total for the job)
...
Example scripts for: Packing GPU jobs
...
Packing the execution of 8 independent instances each using 1 GCD (logical/Slurm GPU)
This kind of packing can be performed with the help of an additional job-packing-wrapper script (jobPackingWrapper.sh) that controls the independent execution of the different codes (or different instances of the same code) to be run by each of the tasks spawned by srun. (It is important to understand that these instances do not interact with each other via MPI messaging.) The isolation of each code/instance should be handled by the logic included in this job-packing-wrapper script.
In the following example, the job-packing-wrapper creates 8 different case directories and then launches 8 independent instances of the hello_nompi code. The output of each execution is saved in a different case directory and log file. In this example the executable does not receive any parameters but, in practice, users should adapt the wrapper logic for their own purposes and, if needed, include the logic to pass different parameters to each instance (a possible approach is sketched after the listing below).
Listing N. jobPackingWrapper.sh
#!/bin/bash
#Job Packing Wrapper: Each srun-task will run a different instance of the executable.
# For this specific example, each srun-task will run on a different case directory
# and create an isolated log file.
# (Adapt wrapper script for your own purposes.)
caseHere=case_${SLURM_PROCID}
echo "Executing job-packing-wrapper instance with caseHere=${caseHere}"
exeDir=${MYSCRATCH}/hello_jobstep
exeName=hello_nompi #Using the no-MPI version of the code
theExe=${exeDir}/${exeName}
logHere=log_${exeName}_${SLURM_JOBID}_${SLURM_PROCID}.out
mkdir -p $caseHere
cd $caseHere
${theExe} > ${logHere} 2>&1
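Since srun launches the wrapper as ./jobPackingWrapper.sh, the script must have execute permission:

chmod +x jobPackingWrapper.sh

If each instance needs its own input parameters, one possible approach (not part of the tested example above) is to index a parameter file with SLURM_PROCID. The following is only a minimal sketch, assuming a hypothetical executable my_code and a hypothetical file params.txt containing one line of arguments per instance; adapt it to your own workflow:

#!/bin/bash
#Hypothetical packing wrapper that passes different parameters to each instance.
#"params.txt" and "my_code" are placeholder names; adapt them to your own case.
caseHere=case_${SLURM_PROCID}
#SLURM_PROCID is 0-based, while sed line numbers are 1-based:
myParams=$(sed -n "$((SLURM_PROCID + 1))p" ${SLURM_SUBMIT_DIR}/params.txt)
theExe=${MYSCRATCH}/my_code
logHere=log_my_code_${SLURM_JOBID}_${SLURM_PROCID}.out
mkdir -p ${caseHere}
cd ${caseHere}
${theExe} ${myParams} > ${logHere} 2>&1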
Note that, besides the use of the additional job-packing-wrapper, the rest of the script is very similar to the single-node exclusive examples given above. As for all scripts, we provide the parameters for requesting the necessary "allocation-packs" for the job. This example considers a job that will make use of the 8 GCDs (logical/Slurm GPUs) on 1 node (8 "allocation-packs"). Each allocation-pack of GPU resources will be used by one of the instances controlled by the job-packing-wrapper. The resource request uses the following two parameters:
#SBATCH --nodes=1 #1 node in this example
#SBATCH --exclusive #All resources of the node are exclusive to this job
# #8 GPUs per node (8 "allocation-packs" in total for the job)
Note that only these two allocation parameters are needed to provide the information for the requested number of allocation-packs, and no other parameter related to memory or CPU cores should be provided in the request header.
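For reference, if exclusive access to the node were not required, the same 8 allocation-packs could presumably be requested with the --gpus-per-node option instead, following the style of the 3-GPU example quoted at the top of this section. This is only a sketch of the alternative request header, not part of the tested example:

#SBATCH --nodes=1           #1 node in this example
#SBATCH --gpus-per-node=8   #8 GPUs per node (8 "allocation-packs" in total for the job)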
The use and management of the allocated resources is controlled by the srun options and some environment variables. From the point of view of srun, this is no different from an MPI job with 8 tasks, but in reality it is not an MPI job: srun spawns 8 tasks, each of them executing the job-packing-wrapper, and the logic of the wrapper results in 8 independent executions of the desired code(s).
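For clarity, this is the srun line used in the full script below, annotated option by option:

srun -l -u -N 1 -n 8 -c 8 --gpus-per-node=8 --gpus-per-task=1 --gpu-bind=closest ./jobPackingWrapper.sh
#  -l                 : prefix each output line with the task number
#  -u                 : unbuffered output
#  -N 1 -n 8          : 1 node and 8 srun-tasks (one per packed instance)
#  -c 8               : 8 CPU cores reserved per task (the CPU share of one "allocation-pack")
#  --gpus-per-node=8  : the job step uses all 8 GCDs (logical/Slurm GPUs) of the node
#  --gpus-per-task=1  : each task receives 1 GCD
#  --gpu-bind=closest : each task is bound to the GCD closest to its CPU cores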
...
Listing N. exampleScript_1NodeExclusive_8GPUs_jobPacking.sh
#!/bin/bash --login
#SBATCH --job-name=JobPacking8GPUsExclusive-bindMethod1
#SBATCH --partition=gpu
#SBATCH --nodes=1 #1 node in this example
#SBATCH --exclusive #All resources of the node are exclusive to this job
#                   #8 GPUs per node (8 "allocation-packs" in total for the job)
#SBATCH --time=00:05:00
#SBATCH --account=<yourProject>-gpu #IMPORTANT: use your own project and the -gpu suffix
#----
#Loading needed modules (adapt this for your own purposes):
module load PrgEnv-cray
module load rocm craype-accel-amd-gfx90a
echo -e "\n\n#------------------------#"
module list
#----
#Printing the status of the given allocation
echo -e "\n\n#------------------------#"
echo "Printing from scontrol:"
scontrol show job ${SLURM_JOBID}
#----
#Job Packing Wrapper: Each srun-task will run a different instance of the executable.
jobPackingWrapper="jobPackingWrapper.sh"
#----
#MPI & OpenMP settings
#export MPICH_GPU_SUPPORT_ENABLED=1 #Not needed for these single-GPU, non-MPI instances (it enables GPU-aware MPI communication among GPUs)
export OMP_NUM_THREADS=1            #This controls the number of CPU cores (threads) used per task by the executable
#----
#Execution
#Note: srun needs the explicit indication of the full set of resources to be used in the job step.
#      These are independent of the allocation parameters (which are not inherited by srun).
echo -e "\n\n#------------------------#"
echo "Test code execution:"
srun -l -u -N 1 -n 8 -c 8 --gpus-per-node=8 --gpus-per-task=1 --gpu-bind=closest ./${jobPackingWrapper}
#----
#Printing information of finished job steps:
echo -e "\n\n#------------------------#"
echo "Printing information of finished jobs steps using sacct:"
sacct -j ${SLURM_JOBID} -o jobid%20,Start%20,elapsed%20
#----
#Done
echo -e "\n\n#------------------------#"
echo "Done" |
After execution of the main Slurm batch script, 8 case directories are created (each of them tagged with the corresponding SLURM_PROCID). Within each of them there is a log file corresponding to the execution of the instance that ran according to the logic of the jobPackingWrapper.sh script:
Terminal N. Output for a single job (on 1 node exclusive) that packs the execution of 8 independent instances
$ sbatch exampleScript_1NodeExclusive_8GPUs_jobPacking.sh
Submitted batch job 339328
$ startDir=$PWD; for iDir in $(ls -d case_*); do echo $iDir; cd $iDir; ls; cat *; cd $startDir; done
case_0
log_hello_nompi_339328_0.out
MAIN 000 - OMP 000 - HWT 002 - Node nid001004 - RunTime_GPU_ID 0 - ROCR_VISIBLE_GPU_ID 0 - GPU_Bus_ID d1
case_1
log_hello_nompi_339328_1.out
MAIN 000 - OMP 000 - HWT 009 - Node nid001004 - RunTime_GPU_ID 0 - ROCR_VISIBLE_GPU_ID 0 - GPU_Bus_ID d6
case_2
log_hello_nompi_339328_2.out
MAIN 000 - OMP 000 - HWT 017 - Node nid001004 - RunTime_GPU_ID 0 - ROCR_VISIBLE_GPU_ID 0 - GPU_Bus_ID c9
case_3
log_hello_nompi_339328_3.out
MAIN 000 - OMP 000 - HWT 025 - Node nid001004 - RunTime_GPU_ID 0 - ROCR_VISIBLE_GPU_ID 0 - GPU_Bus_ID ce
case_4
log_hello_nompi_339328_4.out
MAIN 000 - OMP 000 - HWT 032 - Node nid001004 - RunTime_GPU_ID 0 - ROCR_VISIBLE_GPU_ID 0 - GPU_Bus_ID d9
case_5
log_hello_nompi_339328_5.out
MAIN 000 - OMP 000 - HWT 044 - Node nid001004 - RunTime_GPU_ID 0 - ROCR_VISIBLE_GPU_ID 0 - GPU_Bus_ID de
case_6
log_hello_nompi_339328_6.out
MAIN 000 - OMP 000 - HWT 049 - Node nid001004 - RunTime_GPU_ID 0 - ROCR_VISIBLE_GPU_ID 0 - GPU_Bus_ID c1
case_7
log_hello_nompi_339328_7.out
MAIN 000 - OMP 000 - HWT 057 - Node nid001004 - RunTime_GPU_ID 0 - ROCR_VISIBLE_GPU_ID 0 - GPU_Bus_ID c6
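A quick way to confirm that each instance ran on a different GCD is to collect the reported GPU bus IDs from all the log files (assuming the case-directory and log-file naming used by the wrapper above):

$ grep -H "GPU_Bus_ID" case_*/log_hello_nompi_*.out

Each line of output should report a different GPU_Bus_ID, as in the listing above.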
...