Managing Files with Singularity Overlays

If you are using a program that produces many small files (e.g. millions), you may keep hitting the /scratch file count quota. Creating that many files also places a heavy workload on the metadata servers of the filesystem, degrading its performance. This guide shows how to use Singularity persistent overlays to avoid hitting the quota by hiding the small files from the underlying Lustre filesystem. To Setonix, the overlay looks like a single file, but it holds all of your small files inside. This can be useful for applications such as Trinity, Maker, and OpenFOAM.
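
To see how close a directory is to the file count quota, you can count the regular files beneath it. The following is a minimal sketch: the function name count_files is made up for illustration, and it assumes that no file name contains a newline character.

```shell
#!/bin/bash
# count_files DIR
# Print the number of regular files under DIR (searched recursively).
# Hypothetical helper for illustration; assumes no newlines in file names.
count_files() {
    local dir="${1:-.}"
    # find prints one line per regular file; wc -l counts those lines.
    # tr strips the padding that some wc implementations add.
    find "$dir" -type f | wc -l | tr -d ' '
}
```

For example, count_files /path/to/output prints the number of files under that directory. Walking millions of files is itself a metadata-heavy operation, so a check like this is best run sparingly.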

Step-by-step guide

This example demonstrates how to set up and use the overlay.

Terminal 1. Example overlay creation
# Load the singularity module. Note that the version number may change over time. 
$ module load singularity/4.1.0-nompi

# Pull your desired container and save it as maker.sif, the name used in the commands below. This is just an example container.
$ singularity pull maker.sif docker://quay.io/biocontainers/maker:2.31.11--pl5262hec0a270_1

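# Decide on the maximum size of the overlay in MiB. This example uses 200 MiB. If --size is omitted, Singularity uses its built-in default (64 MiB in recent releases; check "singularity overlay create --help" for your version).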
$ export SIZE="200"

# Decide on the name of your overlay. Here we will call it 'my_overlay'
$ export FILE="my_overlay"

# Use Singularity to create the overlay using the specifications you just set
$ singularity overlay create --size $SIZE $FILE

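# Create an output directory for your program's files in the root directory (i.e. / ) of the container, where writes go to the overlay.
# Note that if you put the directory under a path that is mounted from Setonix instead (for example /scratch or your current working directory), the files are written to Setonix like normal and not to the overlay.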
$ singularity exec --overlay my_overlay maker.sif mkdir /maker_out_dir

# Then when you run your program from the container, you specify the output or temp directory to be the one you made. For example:
$ singularity exec --overlay my_overlay maker.sif maker [options] --TMP /maker_out_dir
 
# To view the files from your run
$ singularity exec --overlay my_overlay maker.sif ls /maker_out_dir

# To copy useful files out to Setonix (copying to the current directory). 
# Note how we’re wrapping the copy command within bash -c; this defers the evaluation of the * wildcard until the container runs the command.
$ singularity exec --overlay my_overlay maker.sif bash -c 'cp -p /maker_out_dir/something.fa* ./'
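
Copying a large number of small files out of the overlay one by one would put them straight back on /scratch. One possible alternative, sketched here, is to pack the output directory into a single tar archive from inside the container. The function name overlay_archive and the archive name in the usage example are made up for illustration, and the sketch assumes that tar and gzip are available inside the container image.

```shell
#!/bin/bash
# overlay_archive OVERLAY IMAGE DIR ARCHIVE
# Pack DIR (a directory inside the overlay) into one gzip-compressed tar file
# ARCHIVE, written relative to the current directory on the host.
# Hypothetical helper; assumes tar and gzip exist inside the container image.
overlay_archive() {
    local overlay="$1" image="$2" dir="$3" archive="$4"
    # -C changes to the parent directory first, so the archive stores
    # relative paths such as maker_out_dir/... rather than absolute ones.
    singularity exec --overlay "$overlay" "$image" \
        tar -czf "$archive" -C "$(dirname "$dir")" "$(basename "$dir")"
}
```

With the names used in this guide, the call would be overlay_archive my_overlay maker.sif /maker_out_dir maker_out.tar.gz, which leaves a single file on Setonix instead of many.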


Here are some additional things to keep in mind when using the --size flag:

  • The value given to --size is the maximum amount of data the overlay can hold. The space actually used inside the overlay may be smaller if there is not enough data to fill it up.
  • The size of the overlay is limited by the amount of disk space available where the overlay file is created.
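
When choosing a value for --size, a back-of-the-envelope estimate from the expected number of files and their average size can be a useful starting point. The following is a rough sketch only: the function name estimate_overlay_mib, the 4 KiB block size, and the 20% default headroom are assumptions made for illustration, and the calculation ignores filesystem metadata, so treat the result as a lower bound.

```shell
#!/bin/bash
# estimate_overlay_mib NUM_FILES AVG_KIB [HEADROOM_PCT]
# Print a rough overlay size in MiB for NUM_FILES files of AVG_KIB KiB each.
# Hypothetical helper; assumes each file occupies whole 4 KiB blocks and
# ignores filesystem metadata, so the result is only a lower bound.
estimate_overlay_mib() {
    local num_files="$1" avg_kib="$2" headroom_pct="${3:-20}"
    local block_kib=4
    # Round the average file size up to a whole number of blocks.
    local per_file_kib=$(( (avg_kib + block_kib - 1) / block_kib * block_kib ))
    local total_kib=$(( num_files * per_file_kib ))
    # Add the headroom and round up to whole MiB (1 MiB = 1024 KiB).
    echo $(( (total_kib * (100 + headroom_pct) + 102399) / 102400 ))
}
```

For example, estimate_overlay_mib 1000000 2 prints 4688, which could then be used as the value of SIZE. The overlay is an ext3 filesystem image, which also has a fixed number of inodes set at creation, so for very large file counts the limit on the number of files may be reached before the space runs out.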


You can use the same overlay with different containers. The directories and files created in the overlay are persistent, so they can be accessed and reused in future runs, even by containers instantiated from different images. All you have to do is mount the filesystem image my_overlay with the --overlay flag.
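
As a small illustration of reusing one overlay across images, the sketch below wraps the singularity exec --overlay call from the terminal example in a shell function. The function name run_with_overlay and the second image name other_tool.sif are made up for illustration.

```shell
#!/bin/bash
# run_with_overlay OVERLAY IMAGE COMMAND [ARGS...]
# Run COMMAND inside IMAGE with OVERLAY mounted as a persistent overlay.
# Hypothetical helper for illustration.
run_with_overlay() {
    local overlay="$1" image="$2"
    # Drop the first two arguments so "$@" holds only the command to run.
    shift 2
    singularity exec --overlay "$overlay" "$image" "$@"
}
```

Running run_with_overlay my_overlay maker.sif ls /maker_out_dir and then run_with_overlay my_overlay other_tool.sif ls /maker_out_dir should list the same files, because both containers mount the same overlay.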