...
Node architecture
The architecture of the GPU nodes differs from that of the CPU-only nodes. The following diagram shows the connections between the CPU and the GPUs on the node, which helps in understanding the recommendations for Slurm job scripts later on this page. Note that the CPU cores are numbered in a slightly different order from the GPUs.
...
(Note that the wrapper needs to have execute permissions. The command "chmod 755 selectGPU_X.sh", or similar, will set them.)
The wrapper script sets the ROCm environment variable ROCR_VISIBLE_DEVICES to the value of the Slurm environment variable SLURM_LOCALID. It then executes the remaining arguments given to the script, which are the usual execution instructions for the program to be run. The SLURM_LOCALID variable holds the identification number of the task within its node (an identifier local to the node, not a global one). Further details about the variable are available in the Slurm documentation.
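As an illustration of the behaviour described above, a wrapper of this kind could look like the following minimal sketch. It is not necessarily identical to the script used on the system. The fallback to 0 when SLURM_LOCALID is unset is an assumption added here so that the sketch can be tried outside a Slurm job.

```shell
#!/bin/bash
# Minimal sketch of a GPU-selection wrapper (e.g. saved as selectGPU_X.sh).

# Expose to this task only the GPU whose index equals the task's
# node-local ID. The ":-0" fallback is an assumption for this sketch,
# used only when the script runs outside Slurm.
export ROCR_VISIBLE_DEVICES="${SLURM_LOCALID:-0}"

# Replace the wrapper process with the command passed as arguments,
# i.e. the usual execution line of the program.
exec "$@"
```

The wrapper is placed in front of the program in the launch line, for example "srun ./selectGPU_X.sh ./my_program", where my_program is a hypothetical executable name and the usual srun options are omitted. Quoting "$@" preserves arguments that contain spaces.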
...