...
Note: Check this page regularly, as it will be updated frequently over the coming months as the deployment of the software progresses.
...
In particular, GPU-supported software modules are still in the process of being deployed.
This page summarises the information needed to start using the Setonix GPU partitions.
...
The GPU partition of Setonix is made up of 192 nodes, 38 of which are high-memory nodes (512 GB of RAM instead of 256 GB). Each GPU node features 4 AMD MI250X GPUs, as depicted in Figure 1. Each MI250X comprises 2 Graphics Compute Dies (GCDs), each of which is effectively seen as a standalone GPU by the system. A 64-core AMD Trento CPU is connected to the four MI250X cards with the AMD Infinity Fabric interconnect, the same interconnect used between the GPU cards, with a peak bandwidth of 200 Gb/s. For more information, refer to the Setonix General Information page. Each GCD can access 64 GB of GPU memory, giving 128 GB per MI250X and 256 GB per standard GPU node.
Figure 1. A GPU node of Setonix
...
Several scientific applications are already able to offload computations to the MI250X, and many others are in the process of being ported to AMD GPUs. Here is a list of the main ones and their current status.
Application | AMD GPU acceleration support status | Module on Setonix
---|---|---
Amber | Supported | Yes
… | … | …
VASP | Porting in progress, no ETA |

Table 1. List of popular applications. * indicates the module is provided as a container.
Module names of AMD GPU applications end with the postfix amd-gfx90a. The most accurate list is given by the module command:

$ module avail gfx90a
Note: TensorFlow is available as a container at the following location, /software/setonix/2022.11/containers/sif/amdih/tensorflow/rocm5.0-tf2.7-dev/tensorflow-rocm5.0-tf2.7-dev.sif, but no module has been created for it yet.
Supported Numerical Libraries
Popular numerical routines and functions have been implemented by AMD to run on their GPU hardware. All of the following are available when loading the rocm modules.
...
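For instance, a code calling one of these libraries can be linked against it after loading the rocm module. The snippet below is a minimal sketch, assuming rocBLAS as the library, a hypothetical source file my_code.c, and that the rocm module sets the ROCM_PATH environment variable.

# Hedged sketch: link a C code against rocBLAS (my_code.c is a placeholder name; ROCM_PATH assumed to be set by the rocm module)
module load PrgEnv-cray craype-accel-amd-gfx90a rocm
cc my_code.c -o my_code -I${ROCM_PATH}/include -L${ROCM_PATH}/lib -lrocblas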
AMD ROCm installations
The main default ROCm installation is rocm/5.2.3, provided by HPE Cray. In addition, Pawsey staff have installed more recent versions, up to rocm/5.7.3, from source using ROCm-from-source. These installations are experimental and users might encounter compilation or linking errors. You are encouraged to explore them during development and to report any issues. We recommend the use of the latest available version unless it creates troubles in your code. Available versions can be checked with the command:

module avail rocm
Submitting Jobs
You can submit GPU jobs to the gpu, gpu-dev and gpu-highmem Slurm partitions using your GPU allocation.
Note that you will need to use a different project code for the --account/-A option. More specifically, it is your project code followed by the -gpu suffix. For instance, if your project code is project1234, then you will have to use project1234-gpu.
GPUs must be explicitly requested to Slurm using the --gres=gpu:<num_gpus>, --gpus-per-task=<num_gpus> or --gpus-per-node=<num_gpus> options.
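For example, a minimal interactive request for a single GCD might look like the sketch below. It uses the example project code from above and the gpu-dev partition; adapt the account, partition and time limit to your own project.

# Hedged sketch: interactive session with one GCD (adapt account, partition and time to your project)
salloc --partition=gpu-dev --nodes=1 --gres=gpu:1 --account=project1234-gpu --time=00:10:00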
An extensive explanation of the use of the GPU nodes (including requesting resources by "allocation packs" and the "manual" binding method) can be found in Example Slurm Batch Scripts for Setonix on GPU Compute Nodes.
Compiling software
If you are using ROCm libraries, such as rocFFT, to offload computations to GPUs, you should be able to use any compiler to link those to your code.
For HIP code, as well as code making use of OpenMP offloading, you must use:
...
- hipcc for C/C++ (there is no OpenACC support currently available for C/C++)
- ftn (wrapper for cray-fortran from PrgEnv-cray) for Fortran. This compiler also allows GPU offloading with OpenACC.
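As a sketch of how this fits together, the commands below compile a Fortran OpenMP-offload code with the Cray wrapper. The source file name is a placeholder, and the -homp flag is assumed to be accepted by the Cray Fortran compiler for enabling OpenMP.

# Hedged sketch: compile Fortran code with OpenMP offload to the MI250X (my_offload.f90 is a placeholder)
module load PrgEnv-cray craype-accel-amd-gfx90a rocm
ftn -homp my_offload.f90 -o my_offload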
When using hipcc, note that the location of the MPI headers and libraries is not automatically included (contrary to the automatic inclusion when using the Cray wrapper scripts). Therefore, if your code also requires MPI, the location of the MPI headers and libraries must be provided to hipcc, as well as the GPU Transport Layer libraries:
MPI include and library flags for hipcc:

-I${MPICH_DIR}/include
-L${MPICH_DIR}/lib -lmpi
-L${CRAY_MPICH_ROOTDIR}/gtl/lib -lmpi_gtl_hsa
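Putting these flags together, a full compile line might look like the following sketch. The source and executable names are placeholders, and --offload-arch=gfx90a targets the MI250X architecture.

# Hedged sketch: build an MPI + HIP code with hipcc (my_mpi_hip.cpp is a placeholder name)
hipcc --offload-arch=gfx90a -I${MPICH_DIR}/include \
      my_mpi_hip.cpp -o my_mpi_hip \
      -L${MPICH_DIR}/lib -lmpi \
      -L${CRAY_MPICH_ROOTDIR}/gtl/lib -lmpi_gtl_hsa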
Also, to ensure proper GPU-to-GPU MPI communication, codes must be compiled and run with the following environment variable set:
MPI environment variable for GPU-GPU communication:

export MPICH_GPU_SUPPORT_ENABLED=1
Accounting
Each MI250X GCD, which corresponds to a Slurm GPU, is charged 64 SU per hour. This means the use of an entire GPU node is charged 512 SU per hour. In general, a job is charged for the largest proportion of core, memory, or GPU usage, rounded up to 1/8ths of a node (corresponding to an individual MI250X GCD). For example, a job that uses two GCDs for three hours (and no more than the corresponding share of cores and memory) is charged 2 × 64 × 3 = 384 SU. Note that GPU node usage is accounted against GPU allocations with the -gpu suffix, which are separate from CPU allocations.
Programming AMD GPUs
You can program AMD MI250X GPUs using HIP, the AMD programming framework equivalent to NVIDIA's CUDA. The HIP platform is available after loading the rocm module.
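As a quick check that the HIP toolchain is available, something like the following sketch can be used. rocminfo is part of the ROCm installation, and the grep pattern simply looks for the MI250X architecture name.

# Hedged sketch: verify the HIP/ROCm toolchain after loading the module
module load rocm
hipcc --version         # prints the HIP compiler version
rocminfo | grep gfx90a  # MI250X GCDs report the gfx90a architecture (run this on a GPU node)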
...
The complete AMD documentation on how to program with HIP can be found here (external site).
Uptake Projects
...
Example Jobscripts
The following are some brief examples of requesting GPUs via Slurm batch scripts on Setonix. For more detail, particularly regarding using shared nodes and the CPU binding for optimal placement relative to GPUs, refer to Example Slurm Batch Scripts for Setonix on GPU Compute Nodes.
Example 1: One process with a single GPU using shared node access
#!/bin/bash --login
#SBATCH --account=project-gpu
#SBATCH --partition=gpu
#SBATCH --nodes=1 #1 node in this example
#SBATCH --gres=gpu:1 #1 GPU per node (1 "allocation-pack" in total for the job)
#SBATCH --time=00:05:00
#----
#Loading needed modules (adapt this for your own purposes):
module load PrgEnv-cray
module load rocm craype-accel-amd-gfx90a
module list
#----
#MPI & OpenMP settings
export OMP_NUM_THREADS=1 #This controls the real number of threads per task
#----
#Execution
srun -N 1 -n 1 -c 8 --gres=gpu:1 ./program
Example 2: Single CPU process that uses the eight GPUs of the node
#!/bin/bash --login
#SBATCH --account=project-gpu
#SBATCH --partition=gpu
#SBATCH --nodes=1 #1 node in this example
#SBATCH --exclusive #All resources of the node are exclusive to this job
# #8 GPUs per node (8 "allocation-packs" in total for the job)
#SBATCH --time=00:05:00
#----
#Loading needed modules (adapt this for your own purposes):
module load PrgEnv-cray
module load rocm craype-accel-amd-gfx90a
module list
#----
#MPI & OpenMP settings
export OMP_NUM_THREADS=1 #This controls the real CPU-cores per task for the executable
#----
#Execution
srun -N 1 -n 1 -c 64 --gres=gpu:8 ./program
Example 3: Eight MPI processes, each with a single GPU (exclusive node access)
#!/bin/bash --login
#SBATCH --account=project-gpu
#SBATCH --partition=gpu
#SBATCH --nodes=1 #1 node in this example
#SBATCH --exclusive #All resources of the node are exclusive to this job
# #8 GPUs per node (8 "allocation packs" in total for the job)
#SBATCH --time=00:05:00
#----
#Loading needed modules (adapt this for your own purposes):
module load PrgEnv-cray
module load rocm craype-accel-amd-gfx90a
module list
#----
#MPI & OpenMP settings
export MPICH_GPU_SUPPORT_ENABLED=1 #This allows for GPU-aware MPI communication among GPUs
export OMP_NUM_THREADS=1 #This controls the real number of threads per task
#----
#Execution
srun -N 1 -n 8 -c 8 --gres=gpu:8 --gpus-per-task=1 --gpu-bind=closest ./program
Note: binding with --gpu-bind=closest may fail for some applications.
The use of --gpu-bind=closest may not work for all codes. For those codes, "manual" binding may be the only reliable method, especially if they rely on OpenMP or OpenACC pragmas to move data between host and GPU while also attempting to use GPU-to-GPU enabled MPI communication. Some codes, like OpenMM, also make use of runtime environment variables and require explicitly setting ROCR_VISIBLE_DEVICES.

Setting visible devices manually:

export ROCR_VISIBLE_DEVICES=0,1 # selects the first two GCDs (the two GCDs of the first MI250X)
Full guides