...
Some applications may require each of the spawned tasks to have access to multiple GPUs. In this case, the scheduler can still provide optimal binding and communication when assigning resources with the srun launcher, although final responsibility for the optimal use of the multiple GPUs assigned to each task lies with the code itself.
As for all scripts, we provide the parameters for requesting the necessary "allocation-packs" for the job. This example considers a job that will make use of 6 GCDs (logical/Slurm GPUs) on 1 node (6 "allocation-packs" in total). The resource request uses the following two parameters:
#SBATCH --nodes=1 #1 node in this example
#SBATCH --gres=gpu:6 #6 GPUs per node (6 "allocation-packs" in total for the job)
...
The use and management of the allocated resources is controlled by the srun options and some environment variables. As mentioned above, the scheduler can still achieve optimal binding when providing 2 GPUs to each of the tasks:
...
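The arithmetic behind this layout (6 GCDs on 1 node, 2 GPUs per task, hence 3 tasks) can be checked with a small helper before submitting. The following is only a sketch under stated assumptions: the function names `tasks_for` and `srun_line` and the executable `./my_program` are hypothetical, and the srun line is printed rather than executed. The flags shown (`-N`, `-n`, `--gres`, `--gpus-per-task`, `--gpu-bind=closest`) are standard Slurm options, but the exact combination recommended for a given system, including any CPU-core settings per task, should be taken from that system's own documentation.

```bash
#!/bin/bash
# Sketch only: helper names and ./my_program are illustrative placeholders,
# not part of the original job script.

# Number of tasks that fit when each task receives a fixed number of GPUs.
tasks_for() {
    local total_gpus=$1 gpus_per_task=$2
    if (( gpus_per_task <= 0 )); then
        echo "error: GPUs per task must be positive" >&2
        return 1
    fi
    if (( total_gpus % gpus_per_task != 0 )); then
        echo "error: $total_gpus GPUs cannot be split evenly into groups of $gpus_per_task" >&2
        return 1
    fi
    echo $(( total_gpus / gpus_per_task ))
}

# Print (do not run) an srun line for the given layout.
# Arguments: number of nodes, GPUs per node, GPUs per task.
srun_line() {
    local nodes=$1 gpus_per_node=$2 gpus_per_task=$3
    local ntasks
    ntasks=$(tasks_for $(( nodes * gpus_per_node )) "$gpus_per_task") || return 1
    echo "srun -N $nodes -n $ntasks --gres=gpu:$gpus_per_node --gpus-per-task=$gpus_per_task --gpu-bind=closest ./my_program"
}

# Layout from this example: 1 node, 6 GPUs per node, 2 GPUs per task.
srun_line 1 6 2
```

Printing the command first makes it easy to confirm that the task count matches the requested "allocation-packs" before the line is placed in the actual job script.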