Changes in Supercomputing Services for 2023

This page outlines the main changes in Pawsey's next-generation supercomputing services.


Setonix will experience downtime at the beginning of January 2023 while final acceptance testing is carried out.


Background

In 2018 the Australian Government awarded $70 million to upgrade Pawsey’s supercomputing infrastructure. As part of the Pawsey Capital Refresh project, Pawsey deployed the Setonix Phase 1 CPU system and made it available in June 2022. The full-scale Setonix Phase 2 system, rendered in Figure 1 below, will be made available to researchers for the 2023 allocation round. This page summarises the main changes to Pawsey supercomputing services in 2023, including changes to allocation schemes.


Figure 1. Render of HPE Cray EX system Setonix

The Supercomputer

Setonix will accommodate all new allocations starting from the 2023 allocation round. Table 1 below presents an overview of Setonix Phase 1 and Phase 2 resources.


Table 1. Setonix Phase 1 and Phase 2 resources explained.

| Purpose | Nodes | CPU | Cores per node | GPU | RAM per node | Availability |
|---|---|---|---|---|---|---|
| Setonix Phase 1 | | | | | | |
| Log in | 4 | 2x AMD EPYC 7713 "Milan" | 2x 64 | – | 256 GB | Available from June 2022 |
| CPU computing | 504 | 2x AMD EPYC 7763 "Milan" | 2x 64 | – | 256 GB | Available from June 2022 |
| CPU high mem | 8 | 2x AMD EPYC 7763 "Milan" | 2x 64 | – | 1 TB | Available from June 2022 |
| Data movement | 8 | 2x AMD 7502P | 1x 32 | – | 128 GB | Available from June 2022 |
| Setonix Phase 2 will add the following | | | | | | |
| Log in | 6 (total: 10) | 2x AMD EPYC 7713 "Milan" | 2x 64 | – | 256 GB | Available from the 2023 allocation round |
| CPU computing | 1088 (total: 1592) | 2x AMD EPYC 7763 "Milan" | 2x 64 | – | 256 GB | Available from the 2023 allocation round |
| GPU computing | 154 | 1x AMD optimised 3rd Gen EPYC "Trento" | 1x 64 | 8 GCDs (from 4x AMD MI250X cards, each card with 2 GCDs), 128 GB HBM2e | 256 GB | Available from the 2023 allocation round |
| GPU high mem | 38 | 1x AMD optimised 3rd Gen EPYC "Trento" | 1x 64 | 8 GCDs (from 4x AMD MI250X cards, each card with 2 GCDs), 128 GB HBM2e | 512 GB | Available from the 2023 allocation round |
| Data movement | 8 (total: 16) | 2x AMD 7502P | 1x 32 | – | 128 GB | Available from the 2023 allocation round |

The Software

Pawsey has developed new Software Stack Policies which describe the principles behind the configuration, maintenance and support of the scientific software stack on Pawsey systems.

The List of Supported Software provides an overview of software that is centrally installed and supported at Pawsey.

The Accounting Model

With Setonix, Pawsey is moving from an exclusive node usage accounting model to a proportional node usage accounting model. The Service Unit (SU) is still mapped to the hourly usage of CPU cores, but users are no longer charged for whole nodes irrespective of whether those nodes are fully utilised. Under the proportional node usage model, users are charged only for the portion of a node they request.

Each CPU compute node of Setonix can run multiple jobs in parallel, submitted by a single user or by many users from any project. This configuration is sometimes called shared access.
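Given a rate of 1 SU per Setonix CPU core-hour (see Table 2), the proportional charge for a job can be sketched as below. This is an illustrative sketch, not Pawsey's authoritative formula; in particular, the rule that a job reserving more than its share of a node's memory is charged for the corresponding cores is an assumption made here for illustration:

```python
# Illustrative sketch of proportional node-usage charging on Setonix.
# The max(core fraction, memory fraction) rule is an assumption.
CORES_PER_NODE = 128                # 2x AMD EPYC 7763 "Milan"
MEM_PER_NODE_GB = 256               # standard CPU compute node
MEM_PER_CORE_GB = MEM_PER_NODE_GB / CORES_PER_NODE   # 2 GB per core

def service_units(cores: int, mem_gb: float, hours: float) -> float:
    """Charge for the requested slice of a node, at 1 SU per core-hour."""
    effective_cores = max(cores, mem_gb / MEM_PER_CORE_GB)
    return effective_cores * hours

# A 32-core, 64 GB, 10-hour job occupies a quarter of a node:
print(service_units(32, 64, 10))    # 320.0
# Requesting 128 GB with only 32 cores ties up half the node's memory:
print(service_units(32, 128, 10))   # 640.0
```

Under the old exclusive model, both of these jobs would have been charged for the full 128-core node (1280 SU).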

A project that has entirely consumed its service units (SUs) for a given quarter of the year will run its jobs in a low-priority mode, called extra, for the remainder of that quarter. Furthermore, if the project's service unit consumption for that same quarter reaches 150% of its allocation, users of that project will not be able to run any more jobs for that quarter.
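The quarterly quota rules above can be summarised in a small sketch (the function name and mode labels are illustrative, not part of Pawsey tooling):

```python
def quarterly_job_mode(used_su: float, allocation_su: float) -> str:
    """Map a project's quarterly SU consumption to its scheduling mode,
    per the rules described above (illustrative sketch only)."""
    usage = used_su / allocation_su
    if usage < 1.0:
        return "normal"          # within the quarterly allocation
    if usage < 1.5:
        return "extra"           # over quota: low-priority mode
    return "blocked"             # at 150% or more: no further jobs this quarter

print(quarterly_job_mode(80, 100))    # normal
print(quarterly_job_mode(120, 100))   # extra
print(quarterly_job_mode(150, 100))   # blocked
```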

Pawsey's accounting model bases the GPU charging rate on energy consumption. This approach, designed for Setonix, has advantages over other models, introducing carbon footprint as a primary driver in determining where computational workflows run on heterogeneous resources.

Pawsey and NCI use slightly different accounting models. Researchers applying for allocations on both Setonix and Gadi should refer to Table 2 when calculating their allocation requests.


Table 2. Setonix and Gadi service unit models

Gadi: 48 Intel Cascade Lake cores per node; 4 NVIDIA V100 GPUs per node.
Setonix: 128 AMD Milan cores per node; 4 AMD MI250X GPUs per node.

| Resources used | Gadi Service Units | Setonix Service Units |
|---|---|---|
| 1 CPU core / hour | 2 | 1 |
| 1 CPU / hour | 48 | 64 |
| 1 CPU node / hour | 96 | 128 |
| 1 GPU / hour | 36* | 128 |
| 1 GPU node / hour | 144* | 512 |

* Calculated based on https://opus.nci.org.au/display/Help/2.2+Job+Cost+Examples for the gpuvolta queue.

How to estimate a Service Unit request for Setonix-GPU

Researchers planning their migration from NVIDIA-based GPU systems, such as NCI’s Gadi, to the AMD-based Setonix-GPU can use the following example strategy to calculate their Service Unit request.

  • Simulation walltime on a single NVIDIA V100 GPU: 1 h
  • Safe estimate of Service Unit usage on a single Setonix AMD MI250X GPU, assuming the MI250X is at least twice as fast as a V100: 1 h * 1/2 * 128 SU/h = 64 Service Units

Please see: https://www.amd.com/en/graphics/server-accelerators-benchmarks 
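Following that strategy, the conversion can be scripted for larger campaigns. The 2x V100-to-MI250X speedup is a conservative planning assumption (see the AMD benchmarks linked above), and the 128 SU per GPU-hour rate comes from Table 2:

```python
SETONIX_SU_PER_GPU_HOUR = 128     # Table 2: 1 GPU / hour on Setonix
V100_TO_MI250X_SPEEDUP = 2.0      # assumed conservative speedup factor

def setonix_gpu_su(v100_hours: float, n_runs: int = 1) -> float:
    """Estimate Setonix-GPU Service Units from measured V100 walltime."""
    mi250x_hours = v100_hours / V100_TO_MI250X_SPEEDUP
    return n_runs * mi250x_hours * SETONIX_SU_PER_GPU_HOUR

print(setonix_gpu_su(1.0))         # 64.0, matching the worked example above
print(setonix_gpu_su(1.0, 500))    # 32000.0 for a 500-simulation campaign
```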

Setonix-GPU migration pathway

Setonix’s AMD MI250X GPUs have a specific migration pathway involving CUDA-to-HIP and OpenACC-to-OpenMP conversions. Pawsey is working closely with research groups within the PaCER project (https://pawsey.org.au/pacer/) and with vendors to further extend the list of supported codes.

Please see: https://www.amd.com/en/technologies/infinity-hub  

The Allocation Schemes

Compute-time merit allocations on Setonix may be obtained through the following schemes:

  • The National Computational Merit Allocation Scheme (NCMAS) – This scheme operates annual allocation calls open to the whole Australian research community and provides substantial amounts of compute time for meritorious, computational-research projects.
  • The Pawsey Partner Merit Allocation Scheme – This scheme operates annual calls open to researchers in Pawsey Partner institutions and provides significant amounts of compute time for meritorious, computational research projects. The Partner institutions are CSIRO, Curtin University, Edith Cowan University, Murdoch University and The University of Western Australia. There is an out-of-session application process for newly eligible project leaders.
  • The new Preparatory Access Scheme is now available for researchers preparing their applications to merit allocations schemes. It is designed to support feasibility studies and benchmarking. More information about this scheme: Preparatory Access Scheme.

A single application to the National Computational Merit Allocation Scheme (NCMAS) or the Pawsey Partner Merit Allocation Scheme can now include both Setonix-CPU and Setonix-GPU requests. Researchers can apply for a Setonix-CPU allocation only, a Setonix-GPU allocation only, or both.

The minimum allocation request for both the National Computational Merit Allocation Scheme (NCMAS) and the Pawsey Partner Merit Allocation Scheme is 1M Service Units.

Pawsey Partner allocation top-ups will no longer be offered from the 2023 allocation round onwards. Researchers can instead submit applications to both schemes separately. The new application form for the Pawsey Partner scheme allows the reuse of documents submitted to NCMAS.

Pawsey has improved its technical review process. The scalability criterion now covers CPU scalability, GPU scalability, and the scalability of data-centric workflows.


Table 3. Resources available on Setonix for the 2023 allocation round

| Scheme | Request | Full year |
|---|---|---|
| National Computational Merit Allocation Scheme | Scheme total capacity | 455M Service Units total: 295M Service Units on Setonix-CPU, 160M Service Units on Setonix-GPU |
| | Minimum request size | 1M Service Units |
| Pawsey Partner Merit Allocation Scheme | Scheme total capacity | 540M Service Units total: 350M Service Units on Setonix-CPU, 190M Service Units on Setonix-GPU |
| | Minimum request size | 1M Service Units |

The Storage

There are a number of changes to File Management on Setonix.

All Supercomputing, Nimbus and Visualisation projects are granted a 1 TB allocation on Acacia, which is shared amongst all project members. If you require more than 1 TB but less than 10 TB, please email an appropriate request to the Pawsey helpdesk. If you require more than 10 TB, you will need to submit an application for Managed Storage in Data Services.
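The request path as a function of required capacity can be sketched as follows. The handling of the exact 10 TB boundary is an assumption here; confirm with the helpdesk for requests at that size:

```python
def acacia_request_path(required_tb: float) -> str:
    """Illustrative sketch of the Acacia storage request tiers above."""
    if required_tb <= 1:
        return "covered by the default 1 TB project allocation"
    if required_tb <= 10:
        return "email a request to the Pawsey helpdesk"
    return "apply for Managed Storage in Data Services"

print(acacia_request_path(0.5))
print(acacia_request_path(5))
print(acacia_request_path(50))
```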

External links