Note: The content of this section is currently being updated to include material relevant to Phase 2 of Setonix and the use of GPUs.
Setonix is Pawsey's flagship supercomputer, based on the HPE Cray EX architecture. It was commissioned in 2020 and delivered in two phases over the course of 2022 and 2023.
...
Type | N. Nodes | CPU | Cores Per Node | RAM Per Node | GPUs Per Node |
---|---|---|---|---|---|
Login | 49 | AMD Milan | 2x 64 | 256GB | n/a |
CPU computing | 504 (Phase 1), 1592 (Total) | AMD Milan (2.45GHz, 280W) | 2x 64 | 256GB | n/a |
CPU high memory | 8 | AMD Milan (2.45GHz, 280W) | 2x 64 | 1TB | n/a |
GPU computing | 154 (Phase 2) | AMD Trento | 1x 64 | 256GB | 8 GCDs (from 4x "AMD MI250X" cards, each card with 2 GCDs) |
GPU high memory | 38 (Phase 2) | AMD Trento | 1x 64 | 512GB | 8 GCDs (from 4x "AMD MI250X" cards, each card with 2 GCDs) |
Data movement | 811 | AMD 7502P | 1x 32 | 128GB | n/a |
More details regarding the hardware architecture and filesystems are made available in the sections below.
...
A CPU compute node has 2 AMD Milan EPYC CPUs with 64 cores each and 256GB of RAM. The 64 cores of a Zen3 AMD CPU (shown in Figure 2 below) are evenly distributed across eight Core Chiplet Dies (CCDs), each of which has 32MB of L3 cache shared among all the cores on that CCD (shown in Figure 3 below). A single Zen3 core is not limited in its use of the L3 cache and can use all of it. The eight CCDs are all connected to an additional memory and I/O controller die through the AMD Infinity Fabric. There are 8 memory channels, each serving RAM modules (DIMMs). The CPU supports 128 lanes of PCIe gen4 and up to 32 SATA or NVMe direct connect devices. Every two CCDs form a NUMA region. For more information about NUMA regions, check the output of the lstopo-no-gui program.
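The figures in the paragraph above imply some simple per-CPU arithmetic: 8 cores per CCD, 4 NUMA regions per CPU, and 16 cores per NUMA region. The Python sketch below works this out. Note that the `locate_core` helper is hypothetical and assumes that cores are numbered contiguously within each CCD and that consecutive CCDs are paired into NUMA regions; the real numbering on a node should be checked with the topology tool mentioned above.

```python
from typing import Tuple

# Figures taken from the text above (per Zen3 CPU).
CORES_PER_CPU = 64
CCDS_PER_CPU = 8
CCDS_PER_NUMA = 2
L3_PER_CCD_MB = 32

# Derived quantities.
CORES_PER_CCD = CORES_PER_CPU // CCDS_PER_CPU    # cores sharing one L3 cache
NUMA_PER_CPU = CCDS_PER_CPU // CCDS_PER_NUMA     # NUMA regions per CPU
CORES_PER_NUMA = CORES_PER_CCD * CCDS_PER_NUMA   # cores per NUMA region
L3_PER_CPU_MB = L3_PER_CCD_MB * CCDS_PER_CPU     # total L3 cache per CPU


def locate_core(core_id: int) -> Tuple[int, int]:
    """Return (ccd_index, numa_index) for a core index within one CPU.

    ASSUMPTION (not from the Setonix documentation): cores are numbered
    contiguously within a CCD, and consecutive CCDs share a NUMA region.
    """
    if not 0 <= core_id < CORES_PER_CPU:
        raise ValueError(f"core_id must be in 0..{CORES_PER_CPU - 1}")
    ccd = core_id // CORES_PER_CCD
    return ccd, ccd // CCDS_PER_NUMA


if __name__ == "__main__":
    print("cores per CCD:", CORES_PER_CCD)
    print("NUMA regions per CPU:", NUMA_PER_CPU)
    print("cores per NUMA region:", CORES_PER_NUMA)
    print("core 17 is on (CCD, NUMA):", locate_core(17))
```

A CPU compute node has two such CPUs, so the per-node counts are double the per-CPU values computed here.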
...
Each GPU compute node has one AMD Trento EPYC CPU with 64 cores and 256GB of RAM. The Trento CPU architecture is similar to the Milan CPUs in the CPU nodes, with additional support for Infinity Fabric links to the four AMD MI250X GPU cards. Each MI250X card has two "logical GPUs" for a total of 8 GPUs per node. The node architecture of the Setonix GPU nodes is pictured in Figure 4 below. Each L3 cache region is connected to a logical GPU in the MI250X GPU cards via Infinity Fabric connections. These GPUs are also closely interconnected via numerous Infinity Fabric links, and also connect to the Slingshot NIC cards for data transfer between nodes.
Note that each MI250X has two Graphics Compute Dies (GCDs) that are accessible as two logical GPUs, for a total of eight per node.
Note: An MI250X GPU card has two GCDs. Previous generations of GPUs had only 1 GCD per GPU card, so the terms "GPU" and "GCD" could be used interchangeably. The interchangeable usage continues even though GPU cards now have more than one GCD. Slurm, for instance, uses only the GPU terminology when referring to accelerator resources, so in Slurm requests each "GPU" refers to one GCD.
The GCD architecture is shown in Figure 5 below, and consists of 110 Compute Units (CUs) (for 220 per MI250X, or 880 per node) with 64GB of GPU memory (for 128GB per MI250X, or 512GB per node).
...
Each Compute Unit contains 64 Stream Processors and 4 Matrix Cores, as shown below in Figure 6.
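The per-card and per-node totals quoted above follow from multiplying the per-GCD figures by 2 GCDs per card and 4 cards per node. The short Python sketch below reproduces that arithmetic; the stream processor and matrix core totals are derived here by multiplication and are not quoted in the text, and the function name `gpu_node_totals` is purely illustrative.

```python
# Figures taken from the text above.
GCDS_PER_CARD = 2
CARDS_PER_NODE = 4
CUS_PER_GCD = 110
MEM_PER_GCD_GB = 64
STREAM_PROCS_PER_CU = 64
MATRIX_CORES_PER_CU = 4


def gpu_node_totals() -> dict:
    """Tally GPU resources per card and per node from the per-GCD figures."""
    gcds_per_node = GCDS_PER_CARD * CARDS_PER_NODE
    return {
        "gcds_per_node": gcds_per_node,
        "cus_per_card": CUS_PER_GCD * GCDS_PER_CARD,
        "cus_per_node": CUS_PER_GCD * gcds_per_node,
        "mem_per_card_gb": MEM_PER_GCD_GB * GCDS_PER_CARD,
        "mem_per_node_gb": MEM_PER_GCD_GB * gcds_per_node,
        # Derived values, not quoted in the text:
        "stream_procs_per_gcd": CUS_PER_GCD * STREAM_PROCS_PER_CU,
        "matrix_cores_per_gcd": CUS_PER_GCD * MATRIX_CORES_PER_CU,
    }


if __name__ == "__main__":
    for name, value in gpu_node_totals().items():
        print(f"{name}: {value}")
```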
...
For more detail regarding the MI250X GPU architecture, refer to the AMD CDNA 2 Architecture Whitepaper (https://www.amd.com/en/technologies/cdna2).
...
Mount point | Variable | Type | Size | Description |
---|---|---|---|---|
| | | Lustre filesystem | 14.4PB | A high-performance parallel filesystem for data processing. |
| | | Lustre filesystem | 393TB | Where system and user software are installed. |
| | | NFS | 92TB | Stores a relatively small number of important files, such as your Linux profile and shell configuration. |
| | | | 2.8PB | Filesystem dedicated to astronomy research. |
...
Partition Charge Rate ✕ Max(Cores Proportion, Memory Proportion, GPU Proportion) ✕ N. of nodes requested ✕ Job Elapsed Time (Hours).
...
- Partition Charge Rate is a constant value associated with each Slurm partition,
- Cores proportion is the number of CPU cores per node requested divided by the total number of CPU cores per node,
- Memory proportion is the amount of memory per node requested divided by the total amount of memory available per node,
- GPU proportion is the number of GPUs requested divided by the total number of GPUs available per node (remember that for Slurm, each GPU is equivalent to a GCD, so each GPU node has 8 GPUs available to request).
For Setonix CPU nodes, the charge rate is 128 SU per node hour, as each CPU node has 128 cores.
For Setonix GPU nodes, the charge rate is 512 SU per node hour, based on the difference in energy consumption between the CPU and GPU node architectures. Since there are fewer GPU nodes than CPU nodes, these GPU nodes are to be used solely for GPU-enabled codes. Thus, resource requests on GPU nodes differ slightly from those on CPU nodes: all requests are in units of GCDs, with 1 GCD = 1 Slurm GPU. Requests cannot be made based on memory but must be based on the number of GPUs to be used.
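The charging formula above can be sketched in Python as follows. This is an illustration only, not Pawsey's accounting code: the function names are hypothetical, the memory available per node is passed in explicitly because its exact value is not stated here, and the GPU helper assumes that only the GPU proportion matters, since GPU node requests are made in units of GCDs.

```python
# Charge rates and node sizes quoted in the text above.
CPU_CHARGE_RATE = 128      # SU per node hour
GPU_CHARGE_RATE = 512      # SU per node hour
CPU_CORES_PER_NODE = 128
GPUS_PER_NODE = 8          # 1 Slurm GPU = 1 GCD


def service_units(charge_rate, cores_prop, mem_prop, gpu_prop, nodes, hours):
    """Charge rate x max(proportions) x nodes requested x elapsed hours."""
    for prop in (cores_prop, mem_prop, gpu_prop):
        if not 0.0 <= prop <= 1.0:
            raise ValueError("each proportion must lie between 0 and 1")
    return charge_rate * max(cores_prop, mem_prop, gpu_prop) * nodes * hours


def cpu_job_su(cores_per_node, mem_gb_per_node, mem_available_gb, nodes, hours):
    """SU for a CPU job; mem_available_gb is supplied by the caller (assumption)."""
    return service_units(
        CPU_CHARGE_RATE,
        cores_per_node / CPU_CORES_PER_NODE,
        mem_gb_per_node / mem_available_gb,
        0.0,  # no GPUs on CPU nodes
        nodes,
        hours,
    )


def gpu_job_su(gpus_per_node, nodes, hours):
    """SU for a GPU job, assuming the GPU proportion is the only one that applies."""
    return service_units(
        GPU_CHARGE_RATE, 0.0, 0.0, gpus_per_node / GPUS_PER_NODE, nodes, hours
    )


if __name__ == "__main__":
    # Half the cores and a quarter of a (hypothetical) 256GB on 1 node for 2 hours.
    print(cpu_job_su(64, 64, 256, nodes=1, hours=2))
    # 2 of the 8 GCDs on 1 node for 3 hours.
    print(gpu_job_su(2, nodes=1, hours=3))
```

Because the formula takes the maximum of the proportions, a CPU job that requests few cores but most of a node's memory is charged according to its memory share.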
Maintenance
Due to the cutting-edge nature of Setonix, regular and frequent updates of the software stack are expected during the first year of Setonix's operation as further optimisations and improvements are made available.
...