Changes in Supercomputing Services for 2023
This page outlines the main changes to Pawsey's next-generation supercomputing services.
Setonix will experience downtime at the beginning of January 2023 while final acceptance testing is carried out.
Background
In 2018 the Australian Government awarded $70 million to upgrade Pawsey’s supercomputing infrastructure. As part of the Pawsey Capital Refresh project, Pawsey deployed and made available the Setonix Phase 1 CPU system in June 2022. The full-scale Setonix Phase 2 system, rendered in Figure 1 below, will be made available to researchers for the 2023 allocation round. This page provides a summary of the main changes to Pawsey supercomputing services in 2023, including changes to allocation schemes.
Figure 1. Render of HPE Cray EX system Setonix
The Supercomputer
Setonix will accommodate all new allocations starting from the 2023 allocation round. The table below presents an overview of Setonix Phase 1 and Phase 2 resources.
Table 1. Setonix Phase 1 and Phase 2 resources explained.
Purpose | Nodes | CPU | Cores per node | GPU | RAM per node | Availability |
---|---|---|---|---|---|---|
**Setonix Phase 1** | | | | | | |
Log in | 4 | 2x AMD EPYC 7713 "Milan" | 2x 64 | – | 256 GB | Available from June 2022 |
CPU computing | 504 | 2x AMD EPYC 7763 "Milan" | 2x 64 | – | 256 GB | Available from June 2022 |
CPU high memory | 8 | 2x AMD EPYC 7763 "Milan" | 2x 64 | – | 1 TB | Available from June 2022 |
Data movement | 8 | 2x AMD 7502P | 1x 32 | – | 128 GB | Available from June 2022 |
**Setonix Phase 2 will add the following** | | | | | | |
Log in | 6 (total: 10) | 2x AMD EPYC 7713 "Milan" | 2x 64 | – | 256 GB | Available from the 2023 allocation round |
CPU computing | 1088 (total: 1592) | 2x AMD EPYC 7763 "Milan" | 2x 64 | – | 256 GB | Available from the 2023 allocation round |
GPU computing | 154 | 1x AMD optimised 3rd Gen EPYC "Trento" | 1x 64 | 8 GCDs (from 4x AMD MI250X cards, each card with 2 GCDs), 128 GB HBM2e | 256 GB | Available from the 2023 allocation round |
GPU high memory | 38 | 1x AMD optimised 3rd Gen EPYC "Trento" | 1x 64 | 8 GCDs (from 4x AMD MI250X cards, each card with 2 GCDs), 128 GB HBM2e | 512 GB | Available from the 2023 allocation round |
Data movement | 8 (total: 16) | 2x AMD 7502P | 1x 32 | – | 128 GB | Available from the 2023 allocation round |
The Software
Pawsey has developed new Software Stack Policies which describe the principles behind the configuration, maintenance and support of the scientific software stack on Pawsey systems.
The List of Supported Software provides an overview of software that is centrally installed and supported at Pawsey.
The Accounting Model
With Setonix, Pawsey is moving from an exclusive node usage accounting model to a proportional node usage accounting model. While the Service Unit (SU) is still mapped to the hourly usage of CPU cores, users are no longer charged for whole nodes irrespective of whether they are fully utilised. With the proportional node usage accounting model, users are charged only for the portion of a node they requested.
Each CPU compute node of Setonix can run multiple jobs in parallel, submitted by a single user or by many users from any project. This configuration is sometimes called shared access.
A project that has entirely consumed its Service Units (SUs) for a given quarter of the year will run its jobs in a low-priority mode, called extra, for the remainder of that quarter. Furthermore, once the project's consumption for that same quarter reaches 150% of its allocation, its users will not be able to run any more jobs for that quarter.
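The proportional charging and quarterly priority rules above can be sketched as follows. This is a minimal illustration, assuming the SU charge is simply cores requested times hours run; the exact Setonix charging formula may include additional factors (such as memory) not modelled here.

```python
CORES_PER_NODE = 128  # Setonix CPU compute node (2x 64-core AMD Milan)

def service_units(cores_requested: int, hours: float) -> float:
    """SU charge under proportional node usage: pay only for the cores requested."""
    return cores_requested * hours

def priority_tier(used_su: float, quarterly_allocation_su: float) -> str:
    """Priority tier for a project, per the quarterly usage rules above."""
    usage = used_su / quarterly_allocation_su
    if usage < 1.0:
        return "normal"   # within the quarterly allocation
    elif usage < 1.5:
        return "extra"    # allocation exhausted: low-priority mode
    else:
        return "blocked"  # 150% reached: no further jobs this quarter

# A 32-core job running for 10 hours is charged a quarter of a node,
# not the whole 128-core node:
assert service_units(32, 10) == 320
assert service_units(CORES_PER_NODE, 10) == 1280

assert priority_tier(900_000, 1_000_000) == "normal"
assert priority_tier(1_200_000, 1_000_000) == "extra"
assert priority_tier(1_600_000, 1_000_000) == "blocked"
```

Under the previous exclusive-node model, the 32-core job above would have been charged for all 128 cores.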
The Pawsey accounting model bases the GPU charging rate on energy consumption. This approach, designed for Setonix, has significant advantages over other models: it introduces carbon footprint as a primary driver in determining where computational workflows run on heterogeneous resources.
Pawsey and NCI use slightly different accounting models. Researchers applying for allocations on Setonix and Gadi should refer to Table 2 when calculating their allocation requests.
Table 2. Setonix and Gadi service unit models
Resources used | Gadi Service Units (CPU: 48 Intel Cascade Lake cores per node; GPU: 4 NVIDIA V100 GPUs per node) | Setonix Service Units (CPU: 128 AMD Milan cores per node; GPU: 4 AMD MI250X GPUs per node) |
---|---|---|
1 CPU core / hour | 2 | 1 |
1 CPU / hour | 48 | 64 |
1 CPU node / hour | 96 | 128 |
1 GPU / hour | 36* | 128 |
1 GPU node / hour | 144* | 512 |

\* calculated based on https://opus.nci.org.au/display/Help/2.2+Job+Cost+Examples for the gpuvolta queue
How to estimate a Service Unit request for Setonix-GPU?
Researchers planning their migration from NVIDIA-based GPU systems, such as NCI’s Gadi, to the AMD-based Setonix-GPU can use the following example strategy to calculate their Service Unit request.
- Measured simulation walltime on a single NVIDIA V100 GPU: 1 hour
- Safe estimate of Service Unit usage on a single Setonix AMD MI250X GPU, assuming it completes the job in roughly half the walltime: 1 h × 1/2 × 128 SU per GPU-hour = 64 Service Units
Please see: https://www.amd.com/en/graphics/server-accelerators-benchmarks
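The estimation strategy above can be written as a small helper. The factor of two speedup of an MI250X over a V100 is an assumption used for illustration; benchmark your own code where possible, and the 128 SU per GPU-hour rate comes from Table 2.

```python
SETONIX_SU_PER_GPU_HOUR = 128   # Setonix GPU charging rate (Table 2)
V100_TO_MI250X_SPEEDUP = 2.0    # assumed speedup; verify with benchmarks

def setonix_gpu_su_estimate(v100_walltime_hours: float, n_gpus: int = 1) -> float:
    """Estimate Setonix-GPU Service Units from measured V100 walltime."""
    mi250x_walltime = v100_walltime_hours / V100_TO_MI250X_SPEEDUP
    return mi250x_walltime * n_gpus * SETONIX_SU_PER_GPU_HOUR

# 1 hour on a single V100 -> 64 SU on Setonix, matching the worked example above.
assert setonix_gpu_su_estimate(1.0) == 64
# 100 one-hour V100 runs on 4 GPUs each:
assert setonix_gpu_su_estimate(100.0, n_gpus=4) == 25600
```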
Setonix-GPU migration pathway
Setonix’s AMD MI250X GPUs require a specific migration pathway involving CUDA-to-HIP and OpenACC-to-OpenMP code conversions. Pawsey is working closely with research groups within the PaCER project (https://pawsey.org.au/pacer/) and with vendors to further extend the list of supported codes.
Please see: https://www.amd.com/en/technologies/infinity-hub
The Allocation Schemes
Compute-time merit allocations on Setonix may be obtained through the following schemes:
- The National Computational Merit Allocation Scheme (NCMAS) – This scheme operates annual allocation calls open to the whole Australian research community and provides substantial amounts of compute time for meritorious, computational-research projects.
- The Pawsey Partner Merit Allocation Scheme – This scheme operates annual calls open to researchers in Pawsey Partner institutions and provides significant amounts of compute time for meritorious, computational research projects. The Partner institutions are CSIRO, Curtin University, Edith Cowan University, Murdoch University and The University of Western Australia. There is an out-of-session application process for newly eligible project leaders.
- The new Preparatory Access Scheme is now available for researchers preparing their applications to merit allocation schemes. It is designed to support feasibility studies and benchmarking. More information about this scheme: Preparatory Access Scheme.
A single application to the National Computational Merit Allocation Scheme (NCMAS) or the Pawsey Partner Merit Allocation Scheme can now include both Setonix-CPU and Setonix-GPU requests. Researchers can apply for a Setonix-CPU allocation only, a Setonix-GPU allocation only, or both.
The minimum allocation request for National Computational Merit Allocation Scheme (NCMAS) and Pawsey Partner Merit Allocation Scheme is 1M Service Units.
Pawsey Partner allocation top-ups will not be offered from the 2023 allocation round onwards. Researchers can submit their applications to both schemes separately. The new application form for the Pawsey Partner scheme allows the reuse of documents submitted to NCMAS.
Pawsey has improved its technical review process. The scalability criterion now covers CPU scalability, GPU scalability, and the scalability of data-centric workflows.
Table 3. Resources available on Setonix for the 2023 allocation round
Scheme | Scheme total capacity (full year) | Minimum request size |
---|---|---|
National Computational Merit Allocation Scheme | 455M Service Units | 1M Service Units |
Pawsey Partner Merit Allocation Scheme | 540M Service Units | 1M Service Units |
The Storage
There are a number of changes to File Management on Setonix.
- Pawsey Filesystems and their Usage provides a detailed description of the filesystems available on Setonix.
- Pawsey Object Storage: Acacia provides a detailed description of the research data storage service Acacia.
All Supercomputing, Nimbus and Visualisation projects are granted a 1 TB allocation on Acacia, which is shared amongst all project members. If you require more than 1 TB but less than 10 TB, please email an appropriate request to the Pawsey helpdesk. If you require more than 10 TB, you will need to submit an application for Managed Storage in Data Services.
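The storage-request thresholds above can be summarised as follows. This is an illustrative sketch only; the actual process goes through the Pawsey helpdesk or a Data Services application, not an API.

```python
def acacia_request_route(required_tb: float) -> str:
    """Map a required Acacia capacity (TB) to the request route described above."""
    if required_tb <= 1:
        return "default"          # 1 TB is granted to every project
    elif required_tb < 10:
        return "helpdesk"         # email a request to the Pawsey helpdesk
    else:
        return "managed-storage"  # apply for Managed Storage in Data Services

assert acacia_request_route(0.5) == "default"
assert acacia_request_route(5) == "helpdesk"
assert acacia_request_route(50) == "managed-storage"
```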
External links
- Supercomputing Documentation
- Acacia - User Guide
- Cristian Di Pietrantonio, Christopher Harris, Maciej Cytowski, "Energy-based Accounting Model for Heterogeneous Supercomputers"