Nimbus for Bioinformatics

Summary


This page describes how to use the new 'Pawsey Bio - Ubuntu 22.04 - 2023-03' image for Nimbus. Instructions on how to choose this image when creating your instance can be found here. This Bio image caters to bioinformatics users who prefer to have their instances pre-installed with software and tools commonly used in the bioinformatics domain, including over 8000 Biocontainer tools.

If you previously used the now deprecated 'Pawsey Bio - Ubuntu 20.04 - 2021-11' image, most of the instructions here still apply.


Pre-installed software


The list of pre-installed software on this image is as follows:

Software | Information
Ansible | An automation platform that Pawsey uses to automate a number of software deployments
CernVM-FS | A read-only file system for accessing files on shared repositories
Docker | A popular container engine
Google Chrome |
Lmod | A modules environment that we use at Pawsey for loading software
Nextflow | A popular workflow manager
Pip | A Python package installer
Python3 |
RStudio | To use RStudio interactively - see Run RStudio Interactively
Singularity | A popular container engine that can be used on HPC
Singularity-HPC | A container modules installer
Spack | A package management tool
X2go | A virtual desktop application - see Setting up a virtual desktop for your instance

Instructions


On this page, we will only cover instructions for how to use CernVM-FS and Singularity-HPC. For instructions for other software listed above, please see the software's original documentation page.

CernVM-FS

CernVM-FS is a read-only file system developed at CERN. It allows shared resources that many researchers use, such as container tools and reference datasets, to be accessed, added to, and updated in one location. At Pawsey, we currently mirror the Biocontainer tools and reference genome datasets from the Galaxy Project's repositories. Please note that these may not be comprehensive, and this service is not meant to replace your current methods for accessing public datasets.

The Biocontainer tools are in the format of Singularity containers. To use them, you can skip this step and proceed to Singularity-HPC.

To access and view the list of Biocontainer tools:

Note: The folders may take a minute or two to load the first time. Subsequent listings are faster.

ls /cvmfs/singularity.galaxyproject.org
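Because the listing is long, it can help to narrow it to a single tool. The following is a minimal sketch, not part of the image: find_images is a hypothetical helper, and it assumes image files are named in the form tool:version inside the directory you pass in. The "all" subdirectory in the usage comment is also an assumption about the repository layout, so check the output of the ls command above first.

```shell
# Hypothetical helper: print the image files for one tool in a directory.
# Assumes images are named "<tool>:<version>"; adjust the pattern if not.
find_images() {
    image_dir="$1"
    tool="$2"
    for image in "$image_dir"/"$tool":*; do
        # When nothing matches, the pattern stays literal, so test existence.
        if [ -e "$image" ]; then
            basename "$image"
        fi
    done
}

# Example usage (the "all" subdirectory is an assumption):
#   find_images /cvmfs/singularity.galaxyproject.org/all fastqc
```

The match is case-sensitive and only covers names that start with the tool name followed by a colon.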


To access the data files:

ls /cvmfs/data.galaxyproject.org
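If you want to confirm that both repositories are reachable before starting a longer job, a small check like the one below may be useful. This is a sketch only: check_repo is a hypothetical helper name, and it simply tests whether the path can be listed, which also triggers the on-demand mount.

```shell
# Hypothetical helper: report whether a repository path can be listed.
# The first call may be slow because listing triggers the mount.
check_repo() {
    repo_path="$1"
    if ls "$repo_path" >/dev/null 2>&1; then
        echo "available: $repo_path"
    else
        echo "unavailable: $repo_path"
        return 1
    fi
}

# Check both repositories mentioned on this page; keep going on failure.
for repo in /cvmfs/singularity.galaxyproject.org /cvmfs/data.galaxyproject.org; do
    check_repo "$repo" || true
done
```

An "unavailable" result suggests the file system is not mounted correctly on the instance.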


If you run into any errors with accessing the file system, run the following to re-install it:

sudo apt-get autoremove cvmfs
sudo apt-get purge cvmfs
sudo rm -rf /etc/cvmfs/
git clone https://github.com/PawseySC/Pawsey-CernVM-FS.git
cd Pawsey-CernVM-FS
sudo ./install-cvmfs.sh install
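After reinstalling, you may want to verify that the client works. The sketch below assumes the reinstall placed the standard cvmfs_config tool on your PATH and uses its probe subcommand, which tries to access each configured repository. verify_cvmfs is a hypothetical wrapper name, not something the install script provides.

```shell
# Hypothetical wrapper: probe the configured CernVM-FS repositories
# if the client tools are installed, otherwise report the problem.
verify_cvmfs() {
    if ! command -v cvmfs_config >/dev/null 2>&1; then
        echo "cvmfs_config not found; the client does not appear to be installed"
        return 1
    fi
    cvmfs_config probe
}

verify_cvmfs || echo "CernVM-FS check did not pass"
```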

Singularity-HPC

Singularity-HPC (SHPC) is software for managing containers as modules. We have integrated SHPC with CernVM-FS, so you can access and use over 8000 Biocontainers without needing to understand container syntax.

To see the entire list of containers available on the registry, run the following command:

shpc list

To narrow down the list to a particular tool, e.g. fastqc:

shpc show -f quay.io/biocontainers/fastqc

To find and use the tool:

module avail fastqc
module load quay.io/biocontainers/fastqc
fastqc --version

To check the list of modules loaded:

module list
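Once the module is loaded, the tool can be used like any other command. As an illustration, the sketch below runs fastqc over several read files and writes the reports to one directory. run_fastqc is a hypothetical helper and the file names in the usage comment are placeholders. The only fastqc option used is --outdir.

```shell
# Hypothetical helper: run fastqc on each existing input file.
# Usage: run_fastqc <output_dir> <reads_file>...
run_fastqc() {
    out_dir="$1"
    shift
    mkdir -p "$out_dir"
    for reads in "$@"; do
        if [ ! -f "$reads" ]; then
            echo "skipping missing file: $reads"
            continue
        fi
        fastqc --outdir "$out_dir" "$reads"
    done
}

# Example usage (file names are placeholders):
#   module load quay.io/biocontainers/fastqc
#   run_fastqc qc_results sample_1.fastq.gz sample_2.fastq.gz
```

Load the module in the same shell session before calling the helper, otherwise the fastqc command will not be found.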