Starting from July 24th, 2023, Setonix will include compute nodes with the new Cray Programming Environment (CPE) 23.02. This update will improve the performance of the supercomputer, but it also implies a series of modifications and updates to the whole working environment for users. The purpose of this page is to consolidate in one place all the important points that researchers need to pay attention to, together with the recommended actions or changes to take.
User actions
Every researcher should follow these steps to use the new CPE:
- review updated versions of software in the system-wide stack provided by Pawsey and HPE
- reinstall any software managed by the group
- rebuild containers previously built on Pawsey-provided images for multi-node MPI support
- update Slurm batch scripts to reflect updated versions of software
Further details are provided in the following sections.
What is new in CPE 23.02
The Cray Programming Environment (CPE) is the programming environment provided by the vendor on Setonix. It has been updated to version 23.02 and includes newer MPI libraries, as well as newer Cray (14.0.3 -> 15.0.1) and GCC (12.1.0 -> 12.2.0) compilers. The AOCC compiler and programming environment should NOT be used as it is unstable.
Pawsey software stack
The system-wide software stack managed by Pawsey made available by the pawseyenv module is now versioned by deployment date, using the YYYY.MM format. You can choose which deployment to use, bearing in mind that a deployment can be marked as deprecated or unsupported due to breaking changes induced by the vendor updates to the system.
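For example, you can list the available deployments and switch between them with the standard module commands (a minimal sketch; replace <DATE_TAG> with the actual deployment date):
# list the available versions of the Pawsey software stack
module avail pawseyenv
# select a specific deployment (replacing whichever is currently loaded)
module swap pawseyenv pawseyenv/<DATE_TAG>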
The new version of the software stack will be 2023.07, and it will replace the 2022.11 stack that was deployed on Setonix up until now. Unfortunately, due to the underlying changes in the Cray Programming Environment, the 2022.11 stack cannot be kept on Setonix. User installations of software performed with spack/0.17.0 will have to be re-executed using spack/0.19.0. Software built by other means will most likely be affected as well, especially if it depends on MPI or the Pawsey software stack.
The newly deployed software stack has been upgraded to include the latest versions of many applications and libraries, such as gromacs, cp2k and lammps.
Spack
Spack has been upgraded from version 0.17.0 to 0.19.0. The new version brings many bug fixes, more supported applications and libraries, and support for building software that runs on AMD GPUs. The ~/.spack directory, where user customisations for Spack are saved, has been moved and renamed to $MYSOFTWARE/setonix/<DATE_TAG>/.spack_user_config.
The Spack build cache has been moved to $MYSCRATCH/setonix so that it does not consume the inode quota on $MYSOFTWARE.
Spack configuration now recognises the ROCm installation on the system and, at the same time, ignores incomplete pre-installed packages that were causing issues when compiling software (e.g. gettext).
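As a quick check of the new setup (a minimal sketch; replace <DATE_TAG> with the actual deployment date):
module load spack/0.19.0
spack --version
# user customisations now live in the versioned directory
ls $MYSOFTWARE/setonix/<DATE_TAG>/.spack_user_config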
User-private and project-wide software stack directories
The user-private and project-wide installations are now versioned by the system-wide software stack deployment date. For instance, user-private software is now located at $MYSOFTWARE/setonix/<DATE_TAG>, where <DATE_TAG> is the date of the related software stack deployment in the YYYY.MM format. When you select which pawseyenv version to use, you also select which user and project installations are visible.
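For example, after the new deployment the versioned layout might look like the following (illustrative output):
# one subdirectory per software stack deployment date
ls $MYSOFTWARE/setonix/
# e.g. 2023.07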
GPU software
The new CPE comes with ROCm 5.2.3, replacing version 5.0.2. In addition, the AMD GPU driver has also been updated and should resolve many of the ROCm issues observed on the system.
GPU-enabled builds of several applications have been added to the stack (or will be in the coming weeks). These replace the containerised deployments that were previously present on Setonix; a sketch of how to find them in the module system follows the list.
- amber 22
- lammps
- cp2k
- gromacs without gpu-direct
- nekRS
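To locate these builds, query the module system; for example (module names are illustrative, check the actual output on the system):
# list available builds of an application, including GPU-enabled variants
module avail lammps
# show details of all versions of a package across the stack
module spider gromacs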
The machine learning frameworks TensorFlow and PyTorch are provided by means of containers and made visible through the module system. Once the respective module is loaded, the python3 interpreter from within the container has access to the TensorFlow or PyTorch Python modules.
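For example (the module version tag below is hypothetical; check module avail for the actual names):
# load the containerised framework; python3 then resolves to the
# interpreter inside the container
module load tensorflow/rocm5.2-tf2.9   # hypothetical version tag
python3 -c "import tensorflow as tf; print(tf.__version__)"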
Singularity Container Engine
The Singularity version is going from 3.8.6 to 3.11.4. Accompanying this update are newer variants of the singularity module. These modules set different Singularity environment variables to ensure certain runtime behaviours. The modules are listed as singularity/3.11.4-*. The set comprises the following variants (a usage sketch follows the list):
- mpi-gpu: Mounts the /scratch and /software filesystems, adds any externally loaded modules to the library path, and ensures that host MPI libraries and GPU MPI libraries are injected into the container. For containerised executables that require MPI and GPU-GPU MPI communication.
- mpi: Mounts the /scratch and /software filesystems, adds any externally loaded modules to the library path, and ensures that host MPI libraries are injected into the container. For containerised executables that require MPI.
- nompi: Mounts the /scratch and /software filesystems and adds any externally loaded modules to the library path. For containerised executables not requiring MPI.
- nohost: Mounts the /scratch and /software filesystems only. For containerised executables requiring more isolation from the host.
- slurm: Mounts the /scratch and /software filesystems and also mounts extra directories to enable Slurm to be run inside the container and launch jobs on Setonix. For containers running Slurm.
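A minimal usage sketch (image name and resource flags are placeholders):
# pick the variant that matches the application, e.g. MPI without GPU
module load singularity/3.11.4-mpi
# run the containerised MPI application across two nodes
srun -N 2 -n 8 singularity exec myapp.sif my_mpi_application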
Users' own software
Manual builds
Researchers need to recompile any software they previously built manually on Setonix. This is necessary because the CPE provides newer versions of various libraries and new paths to libraries (specifically MPICH); recompiling ensures the best possible performance and avoids runtime issues. This process may also require updating any loaded modules, since versions will have changed.
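As a sketch, a typical recompilation with the Cray compiler wrappers looks like the following (source and binary names are placeholders):
# the cc/CC/ftn wrappers link automatically against the current cray-mpich
module load PrgEnv-gnu
cc -O2 -o my_mpi_app my_mpi_app.c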
Conda/Mamba
In general, software installed using Conda/Mamba should not be affected by the updates. The exception is software installed with the conda install --use-local option. The --use-local option installs packages from local files, rather than from the external channels that Conda typically uses (e.g. bioconda or conda-forge). The paths of those local files may have changed, and the packages would need to be rebuilt using the updated paths. Since the default is to use external channels such as conda-forge, we do not expect this to impact many users.
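To check whether any of your packages came from a local build, inspect the channel column of conda list (a minimal sketch):
# packages whose channel is reported as "local" were installed with
# --use-local and may need rebuilding against the updated paths
conda list --show-channel-urls | grep -w local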
R
The version of R provided as a module has changed from 4.1.0 to 4.2.1. This may cause issues with your installed R libraries and packages, and may require you to update them to be compatible with the newer version of R. We have provided an example below to automate the re-installation of your packages against the new version of R.
# Load the old R module version
$ module swap pawseyenv pawseyenv/2022.11
$ module load r/4.1.0
# Open an R session
$ R
# Get a list of your installed packages
> installed_packages <- installed.packages()
> write.csv(installed_packages, file = "installed_packages.csv")
# close the R session
> quit()
# Load the new R version
$ module swap pawseyenv pawseyenv/2023.07
$ module load r/4.2.1
# Open a new R session
$ R
# Load the saved list of installed packages
> installed_packages <- read.csv("installed_packages.csv")
# Get the names of the installed packages
> packages_to_update <- installed_packages[,"Package"]
# Reinstall or update each package
> for (package in packages_to_update) {
install.packages(package, dependencies = TRUE)
}
Python virtual environment
Python virtual environments created on top of a system-provided python module may also need to be recreated, since the interpreter and module versions they reference will have changed.
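A minimal sketch of recreating a virtual environment (the module version, environment path and requirements.txt are assumptions for illustration):
# load the updated python module (version shown is hypothetical)
module load python/3.10.10
# recreate the environment and reinstall its packages
python3 -m venv --clear $MYSOFTWARE/venvs/myenv
source $MYSOFTWARE/venvs/myenv/bin/activate
pip install -r requirements.txt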
Spack installations
Researchers who installed software with spack/0.17.0 will need to load the older software stack to access that particular version of Spack and query it for the previously installed packages. The steps involved are:
- load the old pawseyenv module and the older version of Spack
- query Spack to find all the old builds in your software space (for users in multiple projects, this requires querying each project; see here for how to change projects)
- generate scripts to check that the specifications are acceptable, then rebuild with the newer Spack
An example of the commands to run is provided below.
# swap the pawseyenv module to the previous software stack
module swap pawseyenv pawseyenv/2022.11
# load the older spack version
module load spack/0.17.0
# look for modules within your $MYSOFTWARE installation
hashlist=($(ls -lR ${MYSOFTWARE}/setonix/modules/zen3/| grep .lua | sed "s/.lua//g" | sed "s/-/ /g"| awk '{print $NF}'))
# query spack for these user built packages to get the build information and store it so that it can be used to generate an installation with
# the new spack
echo "#!/bin/bash" > spack.specs.sh
echo "module load spack/0.19.0" >> spack.specs.sh
cp spack.specs.sh spack.install.sh
for h in ${hashlist[@]}
do
echo "spack spec -Il \" >> spack.specs.sh
spec=$(spack find -v /${h})
echo ${spec} >> spack.specs.sh
echo "spack install \" >> spack.install.sh
echo ${spec} >> spack.install.sh
done
# we now have two scripts to run with the newer pawsey environment
module swap pawseyenv pawseyenv/2023.07
# check the specs
bash spack.specs.sh
# if you are happy, install; otherwise iterate on the spec script and the installation script
bash spack.install.sh
Once you are satisfied with the new builds, please clean up the old ones:
module swap pawseyenv pawseyenv/2022.11
# load the older spack version
module load spack/0.17.0
# look for modules within your $MYSOFTWARE installation
hashlist=($(ls -lR ${MYSOFTWARE}/setonix/modules/zen3/| grep .lua | sed "s/.lua//g" | sed "s/-/ /g"| awk '{print $NF}'))
for h in ${hashlist[@]}
do
spack uninstall /${h}
done
spack clean -a
Containers
The Singularity container platform has been upgraded to version 3.11.4. As before, several configurations are available and should be chosen depending on your application. For example, biocontainers that do not use MPI should be run with singularity/3.11.4-nompi, while containers with MPI applications should use singularity/3.11.4-mpi.
As for the containers themselves, those that do not make use of MPI should not suffer execution issues with the upgrades to the CPE and Singularity. However, as the programming environment has changed, containers that make use of MPI may be incompatible with the new host MPI libraries. Such containers may need to be rebuilt on top of Pawsey's new MPICH base image, which has been tested against the new CPE. Users with their own MPI-enabled containers will need to update the recipe and rebuild starting from:
FROM quay.io/pawsey/mpich-base:3.4.3_ubuntu20.04
The new base image, with MPICH 3.4.3 built on Ubuntu 20.04, does not suffer from incompatibilities with the new host libraries. If interested, the recipe (Dockerfile) for this base image is publicly available in Pawsey's Git repository: new MPICH-base-image-Dockerfile, and the built Docker image has already been uploaded to Pawsey's registry quay.io/pawsey. A practical example of an updated recipe for the OpenFOAM tool can also be found in Pawsey's Git repository: Openfoam-v2212-Dockerfile.
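A minimal sketch of the rebuild workflow (image names and tags are placeholders):
# 1. point the first line of your Dockerfile at the new base image:
#      FROM quay.io/pawsey/mpich-base:3.4.3_ubuntu20.04
# 2. rebuild and push the image from a machine with Docker available
docker build -t myuser/myapp:new-cpe .
docker push myuser/myapp:new-cpe
# 3. on Setonix, pull it as a Singularity image
module load singularity/3.11.4-mpi
singularity pull docker://myuser/myapp:new-cpe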
Detailed changes, actions and recommendations
See the child pages of this page for further details.