Singularity is a container platform: it lets you create and run containers that package up pieces of software in a way that is portable and reproducible. With a few basic commands, you can set up a workflow to run on Pawsey systems using Singularity. This page introduces how to get or build a Singularity container image and run that image on HPC systems. |
Prerequisites
Familiarity with:
Versions installed in Pawsey systems
To check the currently installed versions, use the module avail command (current versions may differ from the content shown here):
$ module avail singularity
Different "flavours" of singularity are identified by the suffix beyond the version number. A detailed description of the different flavours is provided in the sections below.
Getting container images and initialising Singularity
Check container availability and load the module
Singularity is installed on most Pawsey systems. Use module
commands to check availability and the version installed:
$ module avail singularity
and then load the Singularity module. For applications that do not need MPI (like many of the bioinformatics containers):
$ module load singularity/4.1.0-nompi
Or, for applications that need MPI:
$ module load singularity/4.1.0-mpi
To avoid errors when downloading and running containers, run the sequence of commands in the following terminal display:
$ mkdir -p /software/projects/<project-id>/<user-name>/.singularity
$ chown -hR $USER:$PAWSEY_PROJECT /software/projects/<project-id>/<user-name>/.singularity
$ find /software/projects/<project-id>/<user-name>/.singularity -type d -exec chmod g+s {} \;
Pull or build a container image
To provide the image that you want to run, either pull an existing container image or build a container image.
Pull containers on the compute nodes. This is particularly important for larger images because the compute nodes will perform better than the shared login nodes.
If you are developing a container, submit a Slurm interactive job allocation for a longer period of time than normally required to accommodate the download time needed for the container image. For example, to ask for 4 hours:
salloc -n 1 -t 4:00:00 -I
Pull an existing image from a container library
You can pull existing containers from a suitable registry such as Docker Hub, Biocontainers, RedHat Quay or Sylabs Container Library. For most users, this will be the most common way you will use containers. It's a good idea to check what containers are already available before deciding to build your own container.
To import Docker images from, for example, Docker Hub, you can use the singularity pull
command. As Docker images are written in layers, Singularity pulls the layers instead of just downloading the image, then combines them into a Singularity SIF format container.
$ singularity pull --dir $MYSOFTWARE/singularity/myRepository docker://user/image:tag
In this example command:
- The --dir flag specifies the location the image is downloaded to
- docker:// indicates that you're pulling from the Docker Hub registry
- user is the hub user
- image is the image or repository name you're pulling
- tag is the Docker Hub tag that identifies which image to pull
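For example, to pull the official Ubuntu 22.04 image from Docker Hub into your repository directory (the image and tag here are chosen purely for illustration):
$ singularity pull --dir $MYSOFTWARE/singularity/myRepository docker://ubuntu:22.04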
Build a container image
To build a container image, we recommend using Docker, either on a local laptop or workstation or on a cloud virtual machine. For example, the Pawsey Nimbus Cloud has Ubuntu installations that come with both Singularity and Docker pre-installed. You cannot build a container image on Setonix because you will not have admin/sudo privileges.
Docker is recommended for:
- Compatibility, portability and shareability: Docker images can be run by any container engine, while Singularity images can only be run by Singularity.
- Ease of development: layer caching in Docker may significantly speed up the process of performing repeated image builds. In addition, Docker allows writing inside containers by default, which makes on-the-fly testing easier.
- Community adoption: community experience and know-how in writing good image recipes focuses on Docker and Dockerfiles.
Information on Dockerfile syntax can be found at Dockerfile reference (external site).
Note: the following commands are meant to be run on a local computer or a cloud virtual machine. They cannot be run on Pawsey systems.
Once you've written a Dockerfile, you can use it to build a container image.
$ sudo docker build -t image:tag .
If you have Singularity installed on the same machine, you can convert the Docker image into the Singularity SIF format.
$ singularity pull image_tag.sif docker-daemon:image:tag
Then this SIF file can be transferred to Pawsey systems.
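For example, with rsync (the Pawsey hostname below is a placeholder for the login or data-mover host you normally connect to):
$ rsync -av image_tag.sif <user-name>@<pawsey-hostname>:/software/projects/<project-id>/<user-name>/singularity/myRepository/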
Best practices for building and maintaining images
Building images
- Minimize image size
  - Each distinct instruction (such as RUN or CMD) in the Dockerfile generates another layer in the container, increasing its size. To minimize image size, use multi-line commands and clean up package manager caches.
- Avoid software bloat
  - Only install the software you need for a given application into a container.
- Make containers modular
  - Creating giant, monolithic containers with every possible application you could need is bad practice. It increases image size, reduces performance, and increases complexity. Containers should only contain a few applications (ideally only one) that you'll use. You can chain together workflows that use multiple containers, meaning if you need to change a particular portion you only need to update a single, small container.
There are websites which provide detailed instructions for writing good Docker recipes, such as Best practices for writing Dockerfiles (external site). We also have some base images and specific examples listed on our Pawsey GitHub containers page (external page).
A simple snippet illustrating these practices is shown below.
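It is a minimal illustrative sketch only; the base image and packages are placeholders:
FROM ubuntu:22.04

# Combine related commands into one RUN instruction and clean the package
# manager cache in the same layer, so the final image stays small.
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
       build-essential \
       python3 \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

CMD ["/bin/bash"]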
Managing your Singularity images
Unlike Docker containers, Singularity containers can be managed as simple files. We recommend that projects keep their Singularity containers in a small number of specific directories. For example, each user might store all of their own Singularity container .sif files in a repository directory such as $MYSOFTWARE/singularity/myRepository. For containers that will be used by several users in the group, we recommend that the repository be maintained as a shared directory, such as /scratch/$PAWSEY_PROJECT/singularity/groupRepository.
When pulling Singularity images, many files and a copy of the images themselves are saved in the cache. Singularity modules at Pawsey define the cache location as $MYSCRATCH/.singularity/cache
. This is to avoid problems with the restricted quota of /home
, which is the default Singularity cache location.
To see all of the copies of the images that currently exist in the cache, use the singularity cache list
command.
$ singularity cache list
When you have finished building or pulling your containers, clean the cache. To wipe everything use the -f
flag:
$ singularity cache clean -f
Running jobs with Singularity
Job scripts require minimal modifications to run within a Singularity container. All that is needed is the singularity exec
statement followed by the image name and then the name of the command to be run. Listing 2 shows an example script:
#!/bin/bash -l
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=1840M
#SBATCH --time=00:10:00
#SBATCH --partition=work
# load the singularity module which provides
# the executable and also sets some environment variables
# (in this case, no MPI is needed)
module load singularity/4.1.0-nompi
# define some useful environment variables for the container
# this would need to be updated for a user's workflow
export myRepository=$MYSOFTWARE/singularity/myRepository
export containerImage=$myRepository/image.sif
# define the command to be run; in this example we use ls
export mycommand=ls
# run the container
srun -N 1 -n 1 -c 1 singularity exec ${containerImage} ${mycommand}
Then submit the script to Slurm as follows:
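Assuming the script above is saved as jobscript.sh (the script name is illustrative):
$ sbatch --account=<your-pawsey-project> jobscript.sh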
Bind mounting host directories
The Singularity configuration at Pawsey takes care of always bind mounting the scratch filesystem for you. You can mount additional host directories to the container with the following syntax:
$ singularity exec -B /path/to/host/directory:/path/in/container <image name> <command>
Notes:
- singularity exec allows you to execute the container with a specific command that is placed at the end of the string.
- -B is the flag for bind mounting the directory to the container.
- You can either remove :/path/in/container or use :/home if you do not know the path in the container that you require to run the command from.
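As an illustrative example (the paths and image name are placeholders), to list the contents of a host data directory from inside a container:
$ singularity exec -B /scratch/$PAWSEY_PROJECT/mydata:/data $MYSOFTWARE/singularity/myRepository/image.sif ls /data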
...
Tip | ||
---|---|---|
| ||
A number of programs assume the existence of
|
Sample use cases
We discuss several common use cases for containers that require some care. Each example shown below highlights the use of particular container features.
Running Python and R
For Singularity containers that have Python or R built-in, use the flag -e
(clean environment) to run the container with an isolated shell environment. This is because both Python and R make extensive use of environment variables and not using a fresh environment can pollute the container environment with pre-existing variables. If you need to read or write from a local directory, you may use the -e
flag in conjunction with the -B
flag.
$ singularity run -e docker://rocker/tidyverse
$ singularity run -B /path/to/host/directory:/path/in/container,/path/to/fake/home:${HOME} -e docker://rocker/tidyverse
There can be specific cases where isolating the shell environment is not feasible, for instance if you're running MPI+Python code, which needs to access scheduler environment variables. Here, a possible workaround is to unset all Python-related variables in the host shell environment and then proceed to execute the container as usual.
$ unset $( env | grep ^PYTHON | cut -d = -f 1 | xargs )
$ srun singularity run docker://python:3.8 my_script.py
Using GPUs
Singularity allows users to make use of GPUs within their containers, for both NVIDIA and AMD GPUs. Nimbus uses NVIDIA GPUs, while Setonix uses AMD GPUs. To enable NVIDIA support, add the runtime flag --nv. To use AMD GPUs, add the --rocm flag to your singularity command instead of --nv.
The script below shows an example of running Gromacs, a popular molecular dynamics package, through a ROCm-capable container:
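This is a minimal sketch only, saved here as gpu.sh; the requested resources, Singularity module flavour, and image name are illustrative and should be adapted to your workflow (see the Setonix GPU documentation linked below):
#!/bin/bash -l
#SBATCH --job-name=gpu
#SBATCH --nodes=1
#SBATCH --gres=gpu:1
#SBATCH --ntasks=1
#SBATCH --time=01:00:00
# Load Singularity (choose the flavour appropriate for your application)
module load singularity/4.1.0-mpi-gpu
# Define the container to use (illustrative ROCm-enabled Gromacs image)
export myRepository=$MYSOFTWARE/singularity/myRepository
export containerImage=$myRepository/gromacs_rocm.sif
# Run the Gromacs preprocessing step with the container
srun singularity exec --rocm $containerImage gmx grompp -f pme.mdp
# Run the Gromacs MD step with the container
srun singularity exec --rocm $containerImage \
    gmx mdrun -ntmpi 1 -nb gpu -pin on -v \
    -noconfout -nsteps 5000 -s topol.tpr -ntomp 1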
The bash script can then be submitted to Slurm as follows:
$ sbatch --account=<your-pawsey-project>-gpu --partition=gpu gpu.sh
For more information on how to use GPU partitions on Setonix see: Example Slurm Batch Scripts for Setonix on GPU Compute Nodes.
Using MPI
MPI applications can be run within Singularity containers. There are two requirements to do so:
- A host MPI installation is required to spawn MPI processes. All Pawsey systems have installed at least one MPICH Application Binary Interface (ABI) compatible implementation; non-Cray clusters also have OpenMPI.
- In the container, an ABI-compatible MPI installation is required to compile the application. Pawsey maintains MPI base images on both DockerHub and RedHat Quay.
Below is an example of a Slurm batch script for using OpenFOAM for Computational Fluid Dynamics simulations, with the container built from the MPICH pawsey/mpich-base container and the compilation of OpenFOAM added on top.
Use a script like the one below to run the simpleFoam parallel solver (it is assumed that the usual preprocessing of the OpenFOAM case has already been performed):
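This is a minimal sketch only; the node and task counts, module flavour, and image name are illustrative:
#!/bin/bash -l
#SBATCH --nodes=2
#SBATCH --ntasks=256
#SBATCH --time=01:00:00
#SBATCH --partition=work
# Load the MPI-enabled Singularity module so that the host MPI is bind mounted
module load singularity/4.1.0-mpi
# Define the container to use (illustrative OpenFOAM image built on pawsey/mpich-base)
export myRepository=$MYSOFTWARE/singularity/myRepository
export containerImage=$myRepository/openfoam_mpich.sif
# Run the simpleFoam parallel solver, one MPI process per Slurm task
srun singularity exec $containerImage simpleFoam -parallel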
Singularity on Pawsey Systems
Singularity flavours on Pawsey Systems
Depending on the cluster, several distinct Singularity modules ("flavours") may be available:

Cluster | singularity/VV-mpi | singularity/VV-mpi-gpu | singularity/VV-nompi | singularity/VV-openmpi | singularity/VV-openmpi-gpu
---|---|---|---|---|---
Setonix (HPE Cray Ex) | yes | yes | yes | no | no
These modules differ in the flavour of the MPI library they bind mount into the containers at runtime, and in whether or not they also bind mount the libraries required for GPU-aware MPI:
- singularity/VV-mpi: Cray MPI (Setonix) or Intel MPI (other clusters). All ABI compatible with MPICH.
- singularity/VV-mpi-gpu: Cray MPI (Setonix) or Intel MPI (other clusters), ABI compatible with MPICH, with GPU-aware MPI.
- singularity/VV-openmpi: OpenMPI.
- singularity/VV-openmpi-gpu: OpenMPI built with CUDA support and any other libraries required by CUDA-aware MPI (for example, gdrcopy).
- singularity/VV-nompi: for applications that do not require MPI communications (commonly bioinformatics applications).
- singularity/VV-nohost: for applications that require total isolation from the host environment (commonly bioinformatics applications).
Features of the modules
These singularity modules set several key environment variables to provide a smoother and more efficient user experience.
To ensure that container images remain portable, Pawsey-provided containers keep host libraries to a minimum. The only case currently supported by Pawsey is the mounting of interconnect/MPI libraries, to maximise the performance of inter-node communication for MPI and GPU-aware MPI enabled applications.
Related pages
External links
- Singularity User Guide
- Dockerfile reference
- For specific details about containerised OpenFOAM tools and usage, refer to the OpenFOAM documentation.