SHPC (Singularity Registry HPC)

SHPC is a utility that allows the installation of software containers in the form of container modules.

Prerequisites

Familiarity with:

What is SHPC?

SHPC allows the installation of software containers in the form of so-called container modules, for transparent usage of containerised applications. An automated process generates a system module for an application, hiding the specificities of the Singularity syntax behind shell functions that take the same name as the corresponding executables.
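To make the idea concrete, here is a minimal hand-written sketch of the kind of wrapper a container module provides. It is not the code that SHPC generates; the image path and the DRY_RUN variable are hypothetical, the latter added only so that the wrapper can print the command instead of running it.

```shell
# Hypothetical location of a container image; SHPC manages the real paths itself.
BWA_SIF="/path/to/containers/bwa.sif"

# Shell function named after the executable inside the container.
# DRY_RUN is an illustrative variable: when set to a non-empty value, the
# Singularity command is printed rather than executed.
bwa() {
    ${DRY_RUN:+echo} singularity exec "$BWA_SIF" /usr/local/bin/bwa "$@"
}

# Dry run: show the Singularity command that "bwa index ref.fa" turns into.
DRY_RUN=1 bwa index ref.fa
```

With a wrapper like this in place, the user types bwa as if it were installed natively, and the container runtime call stays out of sight.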

For containerised applications that are already available in the SHPC registry, installing and using them via SHPC is much simpler than using Singularity itself. For applications that are not yet in the registry, writing a custom container recipe may still be faster than learning how to use Singularity.

SHPC at Pawsey

SHPC has been configured by Pawsey staff to work out-of-the-box; the following aspects have been set up:

  • Directory trees for installed containers and modules

  • Default registry for installation recipes

  • Use of Singularity as the container runtime

  • Naming convention and features of generated module files

  • Configuration for MPI and GPU containers

SHPC is used by Pawsey staff to deploy some of the available scientific software, in particular bioinformatics applications.

Versions installed in Pawsey systems

To check the current installed versions, use the module avail command (current versions may be different from content shown here):

Terminal 1. Checking for installed versions
$ module avail shpc
------------------------------------ /software/setonix/2024.05/pawsey/modules -------------------------------------
shpc/0.1.28

Using SHPC

Installing a container for software that is included in the SHPC registry requires no knowledge of containers or Singularity: all you need is the software name and version.

The key commands of SHPC are show and install; let's see them in action with an example. Suppose we want to install the bioinformatics package BWA. We can use the shpc show command to browse the SHPC registry of available containers:

Terminal 2. Example SHPC show command
$ module load shpc/<VERSION>            # load SHPC module with the correct version
$ shpc show -f bwa                      # search for a package in SHPC registry (string search)
quay.io/biocontainers/bwa
ghcr.io/autamus/bwa
$ shpc show quay.io/biocontainers/bwa   # inspect specific container recipe
docker: biocontainers/bwa
url: https://biocontainers.pro/tools/bwa
maintainer: '@vsoch'
description: shpc-registry automated BioContainers addition for bwa
latest:
  0.7.17--h7132678_10: sha256:f9063141d8c5da87da76392b3a152b927b2913d47373f1874d76f14634b3f684
tags:
  0.7.17--h7132678_9: sha256:07822e4293a8c59755b295c448b9541db6c9bdbfdedb010bdbdcc1e1e935370f
  0.7.17--h7132678_10: sha256:f9063141d8c5da87da76392b3a152b927b2913d47373f1874d76f14634b3f684
docker: quay.io/biocontainers/bwa
aliases:
  bwa: /usr/local/bin/bwa

The information of interest in this output is the list of available versions (or tags), in this case 0.7.17--h7132678_9 and 0.7.17--h7132678_10. Let's install the latter:

Terminal 3. Example SHPC install command
$ shpc install quay.io/biocontainers/bwa:0.7.17--h7132678_10
singularity pull --name /software/projects/pawsey0001/buser/setonix/2024.05/containers/sif/quay.io/biocontainers/bwa/0.7.17--h7132678_10/quay.io-biocontainers-bwa-0.7.17--h7132678_10-sha256:f9063141d8c5da87da76392b3a152b927b2913d47373f1874d76f14634b3f684.sif docker://quay.io/biocontainers/bwa@sha256:f9063141d8c5da87da76392b3a152b927b2913d47373f1874d76f14634b3f684
INFO: Converting OCI blobs to SIF format
INFO: Starting build...
[..]
INFO: Creating SIF file...
Module quay.io/biocontainers/bwa:0.7.17--h7132678_10 was created.
Creating link $module_base/quay.io/biocontainers/bwa/0.7.17--h7132678_10/module.lua -> $views_base/modules/bwa/0.7.17--h7132678_10.lua

That's it!

By default SHPC downloads containers under:

/software/projects/<project-id>/<user-name>/setonix/containers/sif/

and creates modulefiles under:

/software/projects/<project-id>/<user-name>/setonix/containers/modules/

You can now use module avail, module load, and module unload as usual. As these are system modules, note the slash "/" before the version, instead of the colon ":" used above for the tags:

Terminal 4. Example SHPC module load
$ module avail bwa                        # search module
----------- /software/projects/projectcode/rsrchr/setonix/2024.05/containers/views/modules -------------
bwa/0.7.17--h7132678_10
$ module load bwa/0.7.17--h7132678_10     # load module
$ bwa                                     # test command
Program: bwa (alignment via Burrows-Wheeler transformation)
Version: 0.7.17-r1188
Contact: Heng Li <lh3@sanger.ac.uk>
Usage:   bwa <command> [options]
Command: index         index sequences in the FASTA format
         mem           BWA-MEM algorithm
         fastmap       identify super-maximal exact matches
         pemerge       merge overlapping paired ends (EXPERIMENTAL)
         aln           gapped/ungapped alignment
         samse         generate alignment (single ended)
         sampe         generate alignment (paired ended)
         bwasw         BWA-SW for long queries
         shm           manage indices in shared memory
         fa2pac        convert FASTA to PAC format
         pac2bwt       generate BWT from PAC
         pac2bwtgen    alternative algorithm for generating BWT
         bwtupdate     update .bwt to the new format
         bwt2sa        generate SA from BWT and Occ
Note: To use BWA, you need to first index the genome with `bwa index'.
      There are three alignment algorithms in BWA: `mem', `bwasw', and
      `aln/samse/sampe'. If you are not sure which to use, try `bwa mem'
      first. Please `man ./bwa.1' for the manual.
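The relationship between the SHPC identifier (with a colon) and the module name (with a slash) can be sketched with shell parameter expansion. This assumes that module names follow the short name/version form shown in the module avail output above; shpc_to_module is a hypothetical helper, not an SHPC command.

```shell
# Convert an SHPC container identifier (registry/path/name:tag) into the
# short module name (name/tag) listed by "module avail" on this page.
# shpc_to_module is a hypothetical helper for illustration only.
shpc_to_module() {
    id="$1"
    path="${id%%:*}"     # drop the ":tag" suffix, leaving the repository path
    tag="${id#*:}"       # keep only the tag after the colon
    name="${path##*/}"   # keep the last path component (the software name)
    printf '%s/%s\n' "$name" "$tag"
}

shpc_to_module quay.io/biocontainers/bwa:0.7.17--h7132678_10
```

The result can be passed to module load, for example module load "$(shpc_to_module quay.io/biocontainers/bwa:0.7.17--h7132678_10)".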

The full list of SHPC commands can be shown by using one of the help commands:

$ shpc -h
$ shpc <subcommand> -h

Writing an SHPC container recipe

What if a software container is not in the SHPC registry? In this case, you can either write your own container recipe (see terminal 5), or email the Pawsey Helpdesk for help.

Suppose you want to install the bioinformatics tool Velvet. For the sake of this example, we already know that a container for Velvet is available at quay.io/biocontainers/velvet.

Terminal 5. Velvet not on SHPC registry
$ module load shpc/<VERSION>   # load SHPC module
$ shpc show -f velvet
$

As you can see from the empty output, there's no pre-existing entry in the SHPC Container Registry.

Let's see how to create one; in practice, we need to create a YAML container recipe inside the registry tree of SHPC. First, let's get the location of the registry, and then create an appropriate directory structure using the known container repo quay.io/biocontainers/velvet that was postulated above.

Terminal 6. Create SHPC container recipe for Velvet
$ shpc config get registry   # get registry location
registry /software/projects/projectcode/rsrchr/shpc/registry
# create directory tree for desired Velvet container recipe
$ mkdir -p /software/projects/projectcode/rsrchr/shpc/registry/quay.io/biocontainers/velvet
# create a new YAML container recipe in the new path (using vi as text editor here)
$ vi /software/projects/projectcode/rsrchr/shpc/registry/quay.io/biocontainers/velvet/container.yaml
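The directory layout used above (registry root, then the container repository path, then container.yaml) can be captured in a small helper. This is only a sketch: new_recipe_path is a hypothetical function, and the registry root is passed in explicitly rather than parsed from the output of shpc config get registry.

```shell
# Create the directory for a new recipe under a registry root and print the
# path of the container.yaml file to edit.
# new_recipe_path is a hypothetical helper, not part of SHPC.
new_recipe_path() {
    registry_root="$1"    # e.g. the path reported by "shpc config get registry"
    container_repo="$2"   # e.g. quay.io/biocontainers/velvet
    mkdir -p "$registry_root/$container_repo" || return 1
    printf '%s\n' "$registry_root/$container_repo/container.yaml"
}

# Example against a scratch directory, so that no real registry is modified.
scratch_registry="$(mktemp -d)"
new_recipe_path "$scratch_registry" quay.io/biocontainers/velvet
```

On a real system the first argument would be the registry location reported by SHPC, and the printed path is the file to open in your editor.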

Let's see how a possible recipe for Velvet might look:

Listing 1. Velvet container YAML recipe
docker: quay.io/biocontainers/velvet
latest:
  "1.2.10--h5bf99c6_4": "sha256:7fc2606a1431883dcd0acf830abcfeddb975677733d110a085da0f07782f5a27"
tags:
  "1.2.10--h5bf99c6_4": "sha256:7fc2606a1431883dcd0acf830abcfeddb975677733d110a085da0f07782f5a27"
  "1.2.10--hed695b0_3": "sha256:b17fd98d802c1e78dde5fd2c5431efc1969db35a279f3a5ca7afcb46efc66e4a"
maintainer: "@marcodelapierre"
# these are optional
description: "Velvet is a sequence assembler for short reads."
url: https://quay.io/repository/biocontainers/velvet
aliases:
  velvetg: /usr/local/bin/velvetg
  velveth: /usr/local/bin/velveth

Let's comment on the key components of this YAML file:

  • docker is the repository path for the container, without version tags

  • tags is a list of container tags (versions) with the corresponding SHA message digest (shasum); these need to be manually collected from the repository website, in this case https://quay.io/repository/biocontainers/velvet?tab=tags 

  • latest is a copy-paste of the tag from above, to be used as "latest" version

  • maintainer is the GitHub username of the creator (required to contribute the recipe back to the GitHub repository of SHPC; put any name if you don't have one)

  • aliases is a list of command names that will be made available by the SHPC module, with the corresponding commands from inside the container; these need to be manually provided, either by reading through the documentation of the package, or by downloading and inspecting the container
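Because the digests are copied by hand from the repository website, a quick format check can catch truncated or mistyped values before installation. The sketch below only verifies the shape of the string (the prefix sha256: followed by 64 lowercase hexadecimal characters); it does not confirm that the digest matches the tag. is_sha256_digest is a hypothetical helper name.

```shell
# Succeed if the argument looks like a SHA-256 digest as written in the recipe:
# the literal prefix "sha256:" followed by exactly 64 lowercase hex characters.
# is_sha256_digest is a hypothetical helper for illustration only.
is_sha256_digest() {
    printf '%s\n' "$1" | grep -Eq '^sha256:[0-9a-f]{64}$'
}

digest="sha256:7fc2606a1431883dcd0acf830abcfeddb975677733d110a085da0f07782f5a27"
if is_sha256_digest "$digest"; then
    echo "digest looks well-formed"
else
    echo "digest is malformed" >&2
fi
```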

Does this recipe work? Let's give it a go!

Terminal 7. Test new SHPC container recipe for Velvet
$ shpc show -f velvet   # can SHPC locate the new recipe? yes!
quay.io/biocontainers/velvet
$ shpc install quay.io/biocontainers/velvet:1.2.10--h5bf99c6_4   # installing Velvet
singularity pull --name /software/projects/projectcode/rsrchr/shpc/containers/quay.io/biocontainers/velvet/1.2.10--h5bf99c6_4/quay.io-biocontainers-velvet-1.2.10--h5bf99c6_4-sha256:7fc2606a1431883dcd0acf830abcfeddb975677733d110a085da0f07782f5a27.sif docker://quay.io/biocontainers/velvet@sha256:7fc2606a1431883dcd0acf830abcfeddb975677733d110a085da0f07782f5a27
INFO: Converting OCI blobs to SIF format
INFO: Starting build...
Getting image source signatures
[..]
INFO: Creating SIF file...
/software/projects/projectcode/rsrchr/shpc/containers/quay.io/biocontainers/velvet/1.2.10--h5bf99c6_4/quay.io-biocontainers-velvet-1.2.10--h5bf99c6_4-sha256:7fc2606a1431883dcd0acf830abcfeddb975677733d110a085da0f07782f5a27.sif
Module quay.io/biocontainers/velvet:1.2.10--h5bf99c6_4 was created.
$ module load quay.io/biocontainers/velvet/1.2.10--h5bf99c6_4/module   # loading module
$ velvetg --help   # testing a command
Usage:
./velvetg directory [options]
  directory : working directory name
Standard options:
  -cov_cutoff <floating-point|auto> : removal of low coverage nodes AFTER tour bus or allow the system to infer it (default: no removal)
  -ins_length <integer> : expected distance between two paired end reads (default: no read pairing)
  -read_trkg <yes|no> : tracking of short read positions in assembly (default: no tracking)
  -min_contig_lgth <integer> : minimum contig length exported to contigs.fa file (default: hash length * 2)
  -amos_file <yes|no> : export assembly to AMOS file (default: no export)
  -exp_cov <floating point|auto> : expected coverage of unique regions or allow the system to infer it (default: no long or paired-end read resolution)
  -long_cov_cutoff <floating-point>: removal of nodes with low long-read coverage AFTER tour bus (default: no removal)
Advanced options:
  -ins_length* <integer> : expected distance between two paired-end reads in the respective short-read dataset (default: no read pairing)
  -ins_length_long <integer> : expected distance between two long paired-end reads (default: no read pairing)
  -ins_length*_sd <integer> : est. standard deviation of respective dataset (default: 10% of corresponding length)
    [replace '*' by nothing, '2' or '_long' as necessary]
  -scaffolding <yes|no> : scaffolding of contigs used paired end information (default: on)
  -max_branch_length <integer> : maximum length in base pair of bubble (default: 100)
  -max_divergence <floating-point>: maximum divergence rate between two branches in a bubble (default: 0.2)
  -max_gap_count <integer> : maximum number of gaps allowed in the alignment of the two branches of a bubble (default: 3)
  -min_pair_count <integer> : minimum number of paired end connections to justify the scaffolding of two long contigs (default: 5)
  -max_coverage <floating point> : removal of high coverage nodes AFTER tour bus (default: no removal)
  -coverage_mask <int> : minimum coverage required for confident regions of contigs (default: 1)
  -long_mult_cutoff <int> : minimum number of long reads required to merge contigs (default: 2)
  -unused_reads <yes|no> : export unused reads in UnusedReads.fa file (default: no)
  -alignments <yes|no> : export a summary of contig alignment to the reference sequences (default: no)
  -exportFiltered <yes|no> : export the long nodes which were eliminated by the coverage filters (default: no)
  -clean <yes|no> : remove all the intermediary files which are useless for recalculation (default : no)
  -very_clean <yes|no> : remove all the intermediary files (no recalculation possible) (default: no)
  -paired_exp_fraction <double> : remove all the paired end connections which less than the specified fraction of the expected count (default: 0.1)
  -shortMatePaired* <yes|no> : for mate-pair libraries, indicate that the library might be contaminated with paired-end reads (default no)
  -conserveLong <yes|no> : preserve sequences with long reads in them (default no)
Output:
  directory/contigs.fa : fasta file of contigs longer than twice hash length
  directory/stats.txt : stats file (tab-spaced) useful for determining appropriate coverage cutoff
  directory/LastGraph : special formatted file with all the information on the final graph
  directory/velvet_asm.afg : (if requested) AMOS compatible assembly file

Advanced: adding features to your recipe

You can add advanced features to your recipe using Features. The SHPC authors outline this functionality in their documentation.

Currently (as of 11/8/2023), the following features are supported:

Name | Description | Default | Options
gpu | If the container technology supports it, add flags to indicate it uses GPU. | null | nvidia, amd, null
x11 | Bind mount ~/.Xauthority or a custom path | null | true (uses default path ~/.Xauthority), false/null (do not enable) or a custom path to an x11 file
home | Specify and bind mount a custom home path | null | custom path for the home, or false/null
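As an illustration of what the gpu feature amounts to at the container runtime level, the sketch below maps each option to the corresponding Singularity command-line flag (--nv for NVIDIA, --rocm for AMD). How SHPC wires these flags into the generated module is not shown on this page, so treat this as an assumption; gpu_flag, image.sif and my_command are hypothetical names.

```shell
# Map a value of the "gpu" feature to the matching Singularity GPU flag.
# gpu_flag is a hypothetical helper for illustration only.
gpu_flag() {
    case "$1" in
        nvidia) echo "--nv" ;;
        amd)    echo "--rocm" ;;
        *)      echo "" ;;    # null or any other value: no GPU flag
    esac
}

# Print (not run) the kind of command an AMD GPU-enabled module might wrap.
echo "singularity exec $(gpu_flag amd) image.sif my_command"
```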

Note that you can use the home feature to make your container module bind mount a home directory of your choosing. This may be needed for containers that expect access to a particular directory. For example, the aws-cli container expects read/write access to your ~/.aws directory. The following SHPC recipe is an example that bind mounts the user's home directory:

Listing 2. Example SHPC YAML file demonstrating the home feature
docker: quay.io/pawsey/hpc-python
url: https://quay.io/repository/pawsey/hpc-python
maintainer: '@marcodelapierre'
description: Base Python images with popular packages for HPC workflows.
latest:
  '2022.03': sha256:962e7c24302b2dc3946bb22326d0cb4385373113a212231488070aa3e43bd1a1
tags:
  '2021.09': sha256:c2f3f585a0be711046583c5861199107c94e047545325834d68d81d2582b7a04
  2021.09-hdf5mpi: sha256:9d34b5908630e028a6a084891af8b6e65f2626c30e57c06e883f8909850c782b
  '2022.03': sha256:962e7c24302b2dc3946bb22326d0cb4385373113a212231488070aa3e43bd1a1
  2022.03-hdf5mpi: sha256:e9a0db88e98c2388d8731a983ed845b46ce0e2d99d4566802b84142ce21e1c23
aliases:
  python: /usr/local/bin/python
  python3: /usr/local/bin/python3
env:
  PYTHONSTARTUP: ''
  PYTHONUSERBASE: ''
features:
  home: true
