
Running Jobs on Setonix




Setonix uses the Slurm workload manager to schedule user programs for execution. To learn the generalities of using Slurm to schedule programs on supercomputers, visit the Job Scheduling page. In addition, please read the following subsections, which discuss the peculiarities of running jobs on Setonix, together with the Example Slurm Batch Scripts for Setonix on CPU Compute Nodes and Example Slurm Batch Scripts for Setonix on GPU Compute Nodes.


Important

It is highly recommended that you specify values for the --nodes, --ntasks, --cpus-per-task and --time options that are optimal for the job and for the system on which it will run. In addition, use --mem if the job will not use all the resources of a node (shared access), or --exclusive to allocate all the resources of the requested nodes (exclusive access).
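As an illustration of the options above, here is a minimal batch script sketch for a shared-access job. The project code project1234, the executable my_program and all resource values are hypothetical placeholders, not recommendations. For exclusive access, the --mem line would be replaced by an --exclusive directive.

```shell
#!/bin/bash
# Hypothetical project code and illustrative resource values throughout
#SBATCH --account=project1234
#SBATCH --partition=work
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=8
# Shared access: request only the memory the job needs (illustrative value)
#SBATCH --mem=32G
#SBATCH --time=01:00:00

# Give each task as many OpenMP threads as CPUs were requested per task
# (falls back to 1 when the script is run outside a Slurm allocation)
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"

# my_program is a hypothetical executable. The guard lets the script be
# dry-run on a machine without Slurm.
if command -v srun >/dev/null 2>&1; then
  srun -N 1 -n 4 -c 8 ./my_program
else
  echo "srun not available; would launch ./my_program with 4 tasks x 8 CPUs each"
fi
```

Repeating the resource request on the srun line keeps the job step consistent with the allocation requested in the header.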

Overview

Each compute node of Setonix shares its resources by default, so multiple jobs, submitted by many users from the same or different projects, can run on the node at the same time. We call this configuration shared access, and it is the default for Setonix nodes. Nevertheless, users can use Slurm options to override the default and explicitly request exclusive access to the requested nodes.

Nodes are grouped in partitions. Each partition is characterised by a particular configuration of its resources and it is intended for a particular workload or stage of the scientific workflow development. Table 1 shows the list of partitions present on Setonix and their available resources per node.

Each job submitted to the scheduler is assigned a Quality of Service (QoS) level, which determines the priority of the job with respect to the others in the queue. Usually, the default normal QoS applies. Users can boost the priority of their jobs by using the high QoS, which is available for up to 10% of their allocation, in the following way:

$ sbatch --qos=high myscript.sh

Each project has an allocation for a number of service units (SUs) in a year, which is broken into quarters. Jobs submitted under a project will subtract SUs from the project's allocation. A project that has entirely consumed its SUs for a given quarter of the year will run its jobs in low priority mode for that time period. If a project's SU consumption (for a given quarter) hits the 150% usage mark with respect to its granted allocation, no further jobs will be able to run under the project.
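The usage thresholds described above can be sketched as a small shell function. This is only an illustration of the rule as read from this page, not how Slurm computes it: the function name qos_for_usage is hypothetical, the QoS names come from Table 2, and treating exactly 100% and exactly 150% as the switch points is an assumption.

```shell
# Hypothetical helper: classify a project's quarterly SU usage.
# Arguments: SUs used this quarter, SUs granted for this quarter.
qos_for_usage() {
  used=$1
  granted=$2
  # Integer percentage of the quarterly allocation consumed (rounded down)
  pct=$(( used * 100 / granted ))
  if [ "$pct" -ge 150 ]; then
    echo exhausted   # no further jobs can run
  elif [ "$pct" -ge 100 ]; then
    echo low         # jobs run in low priority mode
  else
    echo normal      # default priority
  fi
}

qos_for_usage 50000 100000    # half of the allocation used
qos_for_usage 120000 100000   # allocation exceeded
qos_for_usage 150000 100000   # 150% mark reached
```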


Submitting jobs to the GPU partitions

You will need to use a different project code for the --account/-A option. More specifically, it is your project code followed by the -gpu suffix. For instance, if your project code is project1234, then you will have to use project1234-gpu.
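A minimal sketch of the naming rule follows. The helper gpu_account is a hypothetical name used only for illustration, and myscript.sh is the placeholder script name used earlier on this page. The submission command is printed rather than executed.

```shell
# Hypothetical helper: derive the GPU account name from a project code
gpu_account() {
  printf '%s-gpu\n' "$1"
}

# Print the command that would submit a job to the gpu partition
echo "sbatch --account=$(gpu_account project1234) --partition=gpu myscript.sh"
```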

Table 1. Slurm partitions on Setonix

| Name | Number of nodes | Cores per node | Available node-RAM for jobs | GPU chiplets per node | Types of jobs supported | Max number of nodes per job | Max wall time | Max number of concurrent jobs per user | Max number of jobs submitted per user |
|---|---|---|---|---|---|---|---|---|---|
| work | 1376 | 2x 64 | 230 GB | n/a | Supports CPU-based production jobs. | - | 24h | 256 | 1024 |
| long | 8 | 2x 64 | 230 GB | n/a | Long-running CPU-based production jobs. | 1 | 96h | 4 | 96 |
| highmem | 8 | 2x 64 | 980 GB | n/a | Supports CPU-based production jobs that require a large amount of memory. | 1 | 96h | 2 | 96 |
| debug | 8 | 2x 64 | 230 GB | n/a | Exclusive for development and debugging of CPU code and workflows. | 4 | 1h | 1 | 4 |
| gpu | 124 | 1x 64 | 230 GB | 8 | Supports GPU-based production jobs. | - | 24h | - | - |
| gpu-highmem | 38 | 1x 64 | 460 GB | 8 | Supports GPU-based production jobs requiring a large amount of host memory. | - | 24h | - | - |
| gpu-dev | 20 | 1x 64 | 230 GB | 8 | Exclusive for development and debugging of GPU code and workflows. | - | 4h | - | - |
| copy | 7 | 1x 32 | 115 GB | n/a | Copy of large data to and from the supercomputer's filesystems. | - | 48h | 4 | 2048 |
| askaprt | 180 | 2x 64 | 230 GB | n/a | Dedicated to the ASKAP project (similar to work partition) | - | 24h | 8192 | 8192 |
| casda | 1 | 1x 32 | 115 GB | n/a | Dedicated to the CASDA project (similar to copy partition) | - | 24h | 30 | 40 |
| mwa | 10 | 2x 64 | 230 GB | n/a | Dedicated to the MWA projects (similar to work partition) | - | 24h | 1000 | 2000 |
| mwa-asvo | 10 | 2x 64 | 230 GB | n/a | Dedicated to the MWA projects (similar to work partition) | - | 24h | 1000 | 2000 |
| mwa-gpu | 10 | 1x 64 | 230 GB | 8 | Dedicated to the MWA projects (similar to gpu partition) | - | 24h | 1000 | 2000 |
| mwa-asvocopy | 2 | 1x 32 | 115 GB | n/a | Dedicated to the MWA projects (similar to copy partition) | - | 48h | 32 | 1000 |
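As a sketch of how the wall time limits in Table 1 might be checked before submission, the following helper functions hard-code the "Max wall time" column. The function names are hypothetical and the values are copied from this version of the page, so they may not match the live system.

```shell
# Hypothetical helper: maximum wall time in hours per partition (from Table 1)
partition_max_hours() {
  case "$1" in
    debug) echo 1 ;;
    gpu-dev) echo 4 ;;
    copy|mwa-asvocopy) echo 48 ;;
    long|highmem) echo 96 ;;
    work|gpu|gpu-highmem|askaprt|casda|mwa|mwa-asvo|mwa-gpu) echo 24 ;;
    *) echo "unknown partition: $1" >&2; return 1 ;;
  esac
}

# Succeeds if a request of the given number of hours fits the partition limit
fits_walltime() {
  max=$(partition_max_hours "$1") || return 1
  [ "$2" -le "$max" ]
}

if fits_walltime debug 2; then
  echo "a 2h job fits in debug"
else
  echo "a 2h job exceeds the debug limit of $(partition_max_hours debug)h"
fi
```

On the system itself, scontrol show partition followed by the partition name reports the limits currently configured.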

Table 2. Quality of Service levels applicable to a Slurm job running on Setonix

| Name | Priority Level | Description |
|---|---|---|
| lowest | 0 | Reserved for particular cases. |
| low | 3000 | Priority for jobs past the 100% allocation usage. |
| normal | 10000 | The default priority for production jobs. |
| high | 14000 | Priority boost available to all projects for a fraction (10%) of their allocation. |
| highest | 20000 | Assigned to jobs that are of critical interest (e.g. project part of the national response to an emergency). |
| exhausted | 0 | QoS for jobs for projects that have consumed more than 150% of their allocation. |

Debug and Development Partitions Policy

To ensure the debug and development partitions are available for use by Pawsey researchers, they are strictly reserved for the following activities:
  • Code porting
  • Code debugging
  • Code development
  • Job script/workflow management script porting, debugging and/or development

These partitions must not be used for the following activities:

  • Production runs (i.e., jobs that are intended to generate final results or data for publication, reporting, or use in further analysis)
  • Preparatory or test runs, including but not limited to:
    • Warm-up/generation of initial conditions for simulations
    • Testing configurations, searching for optimal/stability parameters, or setting up simulations, even if the results will not be used directly.
    • Running simulations or experiments to determine production parameters for AI/ML model training (e.g., hyperparameter tuning, configuration testing, validation of stability under different settings).
    • Testing code or scripts in ways that mimic production workloads, such as large-scale simulations or model training, that are not explicitly part of the development or debugging process.

Note: This restriction applies regardless of the execution time of the jobs. For instance, jobs that involve testing for numerical stability, parameter optimization, or early-stage simulations should not be conducted on the debug/development partitions, even if the run times are under the partition's walltime limit.

Job Queue Limits

Users can check the maximum number of jobs they can run at a time (MaxJobs) and the maximum number of jobs they can have submitted (MaxSubmitJobs) for each partition on Setonix using the command:

$ sacctmgr show associations user=$USER cluster=setonix
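The default output of this command is wide. As a sketch, it can be narrowed to the relevant fields and summarised with a small filter. The summarise_limits function is a hypothetical helper, and the assumption that each association record carries a partition name may not hold for every account. The sample records are taken from the work and debug rows of Table 1.

```shell
# Hypothetical helper: summarise pipe-delimited records of the form
# partition|MaxJobs|MaxSubmitJobs read from standard input
summarise_limits() {
  awk -F'|' '{
    part = ($1 == "") ? "(all partitions)" : $1
    printf "%s: up to %s running, %s submitted\n", part, $2, $3
  }'
}

# On Setonix the records would come from sacctmgr (requires Slurm):
#   sacctmgr --noheader --parsable2 show associations user=$USER \
#     cluster=setonix format=Partition,MaxJobs,MaxSubmitJobs | summarise_limits

# Sample records, so the filter can be tried without Slurm
printf 'work|256|1024\ndebug|1|4\n' | summarise_limits
```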

Additional constraints are imposed on projects that have overused their quarterly allocation.

Executing large jobs

When executing large, multinode jobs on Setonix, the use of the --exclusive option in the batch script is recommended. Adding this option results in better resource utilisation within each node assigned to the job.
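As an illustration, the helper below prints the batch script directives for an exclusive job on a given number of nodes of the work partition, which has 2x 64 cores per node according to Table 1. The function name is hypothetical, and one task per core is an assumption that suits pure MPI codes; hybrid MPI/OpenMP codes would use fewer tasks and more CPUs per task.

```shell
# Cores per node on the work partition (2x 64, from Table 1)
CORES_PER_NODE=128

# Hypothetical helper: print #SBATCH directives for an exclusive job
# on the given number of nodes, assuming one task per core
exclusive_directives() {
  nodes=$1
  printf '#SBATCH --nodes=%d\n' "$nodes"
  printf '#SBATCH --ntasks=%d\n' "$(( nodes * CORES_PER_NODE ))"
  printf '#SBATCH --cpus-per-task=1\n'
  printf '#SBATCH --exclusive\n'
}

exclusive_directives 4
```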
