An MI250X GPU card has two GCDs. Previous generations of GPUs had only one GCD per GPU card, so the terms GPU and GCD could be used interchangeably. That interchangeable usage continues even though GPU cards now have more than one GCD. Slurm, for instance, only uses the GPU terminology when referring to accelerator resources, so a request such as --gres=gpu:number is equivalent to a request for a certain number of GCDs per node. On Setonix, the maximum number is 8. (Note that the "equivalent" option --gpus-per-node=number is not recommended, as we have found some bugs with its use.)

Furthermore, Pawsey DOES NOT use the standard Slurm meaning for the --gres=gpu:number parameter. The meaning of this parameter has been superseded to represent a request for a number of "allocation-packs". This new representation has been implemented to achieve best performance. The current allocation method therefore uses the "allocation-pack" as the basic allocation unit and, as explained in the rest of this document, users should request only the number of "allocation-packs" that fulfil the needs of the job.

Each allocation-pack provides:
- 1 whole CPU chiplet (8 CPU cores)
- ~32 GB memory (1/8 of the total available RAM)
- 1 GCD (Slurm GPU) directly connected to that chiplet
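
As a rough illustration of the arithmetic, the following Python sketch tallies what a given number of allocation-packs would provide on one node. The names used (resources_for_packs, PackResources and the constants) are hypothetical and are not part of Slurm or of any Pawsey tool. The sketch assumes only the figures listed above, treats the ~32 GB of memory as an approximation, and merely formats the --gres=gpu:number option as a string; it does not submit anything.

```python
from dataclasses import dataclass

# Figures taken from the allocation-pack description; all names are hypothetical.
CORES_PER_PACK = 8        # 1 whole CPU chiplet
MEM_GB_PER_PACK = 32      # approximate: 1/8 of the node's RAM
GCDS_PER_PACK = 1         # 1 GCD, i.e. one "Slurm GPU"
MAX_PACKS_PER_NODE = 8    # maximum number per node on Setonix


@dataclass(frozen=True)
class PackResources:
    packs: int
    cpu_cores: int
    approx_mem_gb: int
    gcds: int
    gres_option: str


def resources_for_packs(packs: int) -> PackResources:
    """Return the per-node resources implied by a number of allocation-packs."""
    if not 1 <= packs <= MAX_PACKS_PER_NODE:
        raise ValueError(f"packs must be between 1 and {MAX_PACKS_PER_NODE}")
    return PackResources(
        packs=packs,
        cpu_cores=packs * CORES_PER_PACK,
        approx_mem_gb=packs * MEM_GB_PER_PACK,
        gcds=packs * GCDS_PER_PACK,
        gres_option=f"--gres=gpu:{packs}",
    )


if __name__ == "__main__":
    # Print the tally for a few example request sizes.
    for n in (1, 2, 8):
        print(resources_for_packs(n))
```

For example, resources_for_packs(2) corresponds to a request of --gres=gpu:2: 16 CPU cores, roughly 64 GB of memory and 2 GCDs.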