HPC workflow using managed storage via rclone
The purpose of this page is to demonstrate how to work with managed storage in a Slurm workflow at Pawsey. It is intended for technical users who are familiar with bash scripting and are creating new workflows or migrating existing ones. It should help the user get started or become more familiar with HPC workflow patterns through a realistic but simple example. The important points demonstrated are: inter-cluster scheduling, rclone, Singularity and interaction with the new managed storage.
All the scripts for this example can be found in the LTS template repo.
Scenario
You are a technical officer at a large astronomy research centre that needs to implement an on-demand workflow for end users. Your end users want small cutouts, 20 pixels wide, around one or more astronomical objects in a collection of very large images.
You have created a web interface (out of scope for this example) on a VM on a cloud platform (such as Pawsey Nimbus or AWS EC2) on which users identify the images and objects they are interested in and specify the RA-DEC (see Equatorial coordinate system if unfamiliar) of each object. The web service submits a message to a dedicated queue (in this case AWS SQS). The message body is as follows:
{ "file": "lts-template/data/frame-z-004874-4-0422.fits", "ra": "0.1", "dec": "0.25" }
You have also created a simple script (out of scope for this example) that uses scrontab (see /wiki/spaces/PSC/pages/57369466 and the Scrontab manpage) to read that message from the queue and use it to schedule the job that does the cropping.
For the scrontab script to schedule the job on the cluster, it creates a file called files.txt containing the above file and sets the RA_OBJECT and DEC_OBJECT environment variables (for the example below these have been hard-coded so that you can more easily run the example independently). It then runs schedule.sh files.txt, which is explained below.
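That glue script is not part of the template repo, but a minimal sketch of the idea (assuming the AWS CLI and jq are available; the queue URL is a placeholder and message deletion from the queue is omitted) could look like this:

#!/bin/bash
# Hedged sketch of the scrontab-driven reader (hypothetical read_queue.sh, not in the repo).
# Example scrontab entry (assumption): */15 * * * * ~/lts-template/read_queue.sh
QUEUE_URL="https://sqs.ap-southeast-2.amazonaws.com/123456789012/crop-requests"  # placeholder
MSG=$(aws sqs receive-message --queue-url "$QUEUE_URL" --max-number-of-messages 1 \
      --query 'Messages[0].Body' --output text)
if [ -n "$MSG" ] && [ "$MSG" != "None" ]; then
    echo "$MSG" | jq -r '.file' > files.txt
    export RA_OBJECT=$(echo "$MSG" | jq -r '.ra')
    export DEC_OBJECT=$(echo "$MSG" | jq -r '.dec')
    ./schedule.sh files.txt
fi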
Scheduling the cropping job
The workflow is very simple as it consists of only four steps:
1. Sync the input image from managed storage (ceph) to $MYSCRATCH (copyq on zeus).
2. Convert the object's world coordinates (RA-DEC) to pixel coordinates (debugq on magnus).
3. Crop the image around those pixel coordinates (debugq on magnus).
4. Sync the cropped result back to managed storage (copyq on zeus).
The following is a copy of the script to schedule the job at the time of writing:
#!/bin/bash
FILE_LIST=$1

# Submit to copyq and get job id for next step
JOB_ID=$(sbatch -p copyq -M zeus \
    --export=RCLONE_CONFIG_CEPH_SECRET_ACCESS_KEY,RCLONE_CONFIG_CEPH_ACCESS_KEY_ID \
    sync.sh ceph:lts-template/data $MYSCRATCH/lts-template/data $FILE_LIST \
    | perl -ne 'm/(\d+).*$/g; print $1;' )
echo "Scheduled sync-in JOB_ID: $JOB_ID"

# Keeping it simple just running for one file
FILE=$(head -1 $FILE_LIST)
XY_FILE=${FILE/.fits/_crop.xy}
# Right ascension of object of interest
RA_OBJECT=0.1
# Declination of object of interest
DEC_OBJECT=0.25

JOB_ID2=$(sbatch -p debugq -M magnus -d afterok:$JOB_ID \
    get_world_2_pix.sh $MYSCRATCH/lts-template/data/$FILE \
    $RA_OBJECT $DEC_OBJECT \
    $MYSCRATCH/lts-template/data/$XY_FILE \
    | perl -ne 'm/(\d+).*$/g; print $1;' )
echo "Scheduled sky2xy JOB_ID: $JOB_ID2"

OUTPUT_FILE=${FILE/.fits/_crop.fits}
# Label (a string e.g. ngc123) of object of interest
LABEL_OBJECT=galaxy_1
# Number of pixels around object of interest to crop
PIXELS_AROUND_CENTRE=20

# This script will assume there is only one object and one set of coordinates
JOB_ID3=$(sbatch -p debugq -M magnus -d afterok:$JOB_ID2 \
    crop_image_around_pixels.sh $MYSCRATCH/lts-template/data/$FILE \
    $MYSCRATCH/lts-template/data/$OUTPUT_FILE \
    $MYSCRATCH/lts-template/data/$XY_FILE \
    $LABEL_OBJECT \
    $PIXELS_AROUND_CENTRE \
    | perl -ne 'm/(\d+).*$/g; print $1;' )
echo "Scheduled crop JOB_ID: $JOB_ID3"

echo "$MYSCRATCH/lts-template/data/$OUTPUT_FILE" > sync_back_list.txt

# Submit to copyq and get job id for next step
JOB_ID4=$(sbatch -p copyq -M zeus -d afterok:$JOB_ID3 \
    --export=RCLONE_CONFIG_CEPH_SECRET_ACCESS_KEY,RCLONE_CONFIG_CEPH_ACCESS_KEY_ID \
    sync.sh $MYSCRATCH/lts-template/data ceph:lts-template/data sync_back_list.txt \
    | perl -ne 'm/(\d+).*$/g; print $1;' )
echo "Scheduled sync-back JOB_ID: $JOB_ID4"

# You could add another job here to perform other tasks (such as reporting back to an external job queue or housekeeping)
The script schedules four jobs, retaining the job ID of each to use in the subsequent sbatch invocation, so that each job's execution is delayed until the previous one has exited with a success code ({{-d afterok}}).
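The job ID is extracted from the standard output of sbatch with a perl one-liner. As a hedged aside, sbatch also has a --parsable flag that prints only the job ID (followed by ";cluster" when -M is used), so an equivalent extraction for the first step could look like this sketch:

# Hedged sketch: same submission, extracting the job ID via --parsable instead of perl.
# With -M the output is "jobid;cluster", so cut strips the cluster name.
JOB_ID=$(sbatch --parsable -p copyq -M zeus \
    --export=RCLONE_CONFIG_CEPH_SECRET_ACCESS_KEY,RCLONE_CONFIG_CEPH_ACCESS_KEY_ID \
    sync.sh ceph:lts-template/data $MYSCRATCH/lts-template/data $FILE_LIST \
    | cut -d';' -f1)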
The first and last steps both use the sync.sh script (see below), which expects two environment variables containing your credentials for ceph. You can set those simply with:
export RCLONE_CONFIG_CEPH_ACCESS_KEY_ID="deadbeefdeadbeefdeadbeef"
export RCLONE_CONFIG_CEPH_SECRET_ACCESS_KEY="deadbeefdeadbeefdeadbeef"
Let us now look at the four steps in detail.
The sync step
The sync step uses the rclone module (currently available on zeus, where the copyq is) to transfer files.
Currently, the script looks like so:
#!/bin/bash
# e.g. paths (input or output):
#   "ceph:lts-template/data"
#   "$MYSCRATCH/lts-template/data"
FROM_PATH=$1
TO_PATH=$2
# Optional list of files to avoid syncing everything in the source
FILE_LIST=$([ "$3" == "" ] && echo "" || echo "--files-from="$(realpath $3))

module load rclone

# Setup remote
export RCLONE_CONFIG_CEPH_TYPE="s3"
# Following endpoint likely to change (possibly the port) when the prod system is launched
export RCLONE_CONFIG_CEPH_ENDPOINT="https://nimbus.pawsey.org.au:8080"
# For non-public resources you will need your credentials in this script or the environment e.g.:
# export RCLONE_CONFIG_CEPH_ACCESS_KEY_ID="deadbeefdeadbeefdeadbeef"
# export RCLONE_CONFIG_CEPH_SECRET_ACCESS_KEY="deadbeefdeadbeefdeadbeef"

rclone sync $FILE_LIST $FROM_PATH $TO_PATH
As you can see, it takes two arguments and an optional third: the input path to sync from, the output path to sync to, and a list of files. If the final argument is not given, it will sync all files from the source path to the destination path.
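For example, the script could be run interactively (outside sbatch) to stage a single file; this is a sketch that assumes the credential variables above are exported and that paths in the file list are given relative to the source path:

# Hedged example: staging one file by hand rather than via the copyq.
echo "frame-z-004874-4-0422.fits" > files.txt
./sync.sh ceph:lts-template/data $MYSCRATCH/lts-template/data files.txt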
The most important part to note is the lines beginning with {{export}} that set the environment variables for rclone. Rclone takes the part between RCLONE_CONFIG_ and _TYPE or _ENDPOINT as the name of the alias in rclone parlance (the remote), meaning that the environment variables both configure and define an alias, in this case ceph (in lowercase, despite the environment variables being in uppercase). Rclone can interface with almost all cloud providers, meaning that users can download or upload data from/to virtually any service imaginable. For more details refer to the official documentation on the rclone page. This particular script also expects the credentials variables, as noted above.
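In other words, these environment variables are a drop-in equivalent of an rclone.conf stanza. The following sketch shows the equivalent configuration file entry and a simple listing command to test the remote (the credentials shown are placeholders):

# Equivalent ~/.config/rclone/rclone.conf stanza (sketch; placeholder credentials):
# [ceph]
# type = s3
# endpoint = https://nimbus.pawsey.org.au:8080
# access_key_id = deadbeefdeadbeefdeadbeef
# secret_access_key = deadbeefdeadbeefdeadbeef

# Test the remote by listing the example data
# (requires the RCLONE_CONFIG_CEPH_* variables to be exported in your shell if no config file is used)
module load rclone
rclone lsf ceph:lts-template/data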
The world to pixel step
This step runs on a single node of your compute cluster (in this case magnus).
The script currently looks like so:
#!/bin/bash
INPUT_FILE_NAME=$1
RA=$2
DEC=$3
OUTPUT_FILE_NAME=$4

DOCKER_IMAGE=curtinfop/wcstools

module load singularity

singularity run docker://$DOCKER_IMAGE sky2xy \
    $INPUT_FILE_NAME \
    $RA $DEC > $OUTPUT_FILE_NAME
What is important to note here is that it uses the singularity module to execute a Docker container directly from Docker Hub (in this case https://hub.docker.com/r/curtinfop/wcstools), which simply contains a compiled copy of WCS Tools.
Part of wcstools is sky2xy, which, given a FITS file with WCS attributes in the header, gives us the pixel coordinates that correspond to some sky coordinates (they will be marked as off image if they are not contained in the image); the script writes those to the provided output file name.
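To try this step by hand, the conversion can be run interactively; this is a sketch that assumes the example FITS file has already been synced to $MYSCRATCH:

# Hedged example: converting RA-DEC 0.1, 0.25 to pixel coordinates interactively.
module load singularity
singularity run docker://curtinfop/wcstools sky2xy \
    $MYSCRATCH/lts-template/data/frame-z-004874-4-0422.fits 0.1 0.25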
The cropping step
This step also runs on a single node of your compute cluster (in this case magnus).
The script currently looks like so:
#!/bin/bash
INPUT_FILE_NAME=$1
OUTPUT_FILE_NAME=$2
XY_FILE_NAME=$3
OBJ_ID=$4
WIDTH_IN_PIXELS=$5

module load singularity

DOCKER_IMAGE=curtinfop/gnuastro

CENTRE_COORDINATES=$(perl -lne '/(off image|offscale)/ and exit 1 or @a=split /\s+/ and print join ",", @a[4..5];' < $XY_FILE_NAME)

# If there is more than one set of coordinates this needs to become a loop
singularity run docker://$DOCKER_IMAGE astcrop \
    -o $OUTPUT_FILE_NAME \
    -h0 \
    --mode=img \
    --center=$CENTRE_COORDINATES \
    --width=$WIDTH_IN_PIXELS \
    $INPUT_FILE_NAME
This is another simple script that executes a Docker container with the gnuastro tools, from which it calls astcrop, a very fast tool for cropping FITS images. It first parses the sky2xy output to extract the pixel coordinates of the object, exiting if the object is marked off image or offscale.
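Following the file naming used in schedule.sh, this step can also be run by hand; the sketch below assumes the sky2xy output file from the previous step already exists:

# Hedged example: cropping 20 pixels around galaxy_1 using the paths schedule.sh would build.
./crop_image_around_pixels.sh \
    $MYSCRATCH/lts-template/data/frame-z-004874-4-0422.fits \
    $MYSCRATCH/lts-template/data/frame-z-004874-4-0422_crop.fits \
    $MYSCRATCH/lts-template/data/frame-z-004874-4-0422_crop.xy \
    galaxy_1 \
    20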
Finally
Once all steps have completed successfully, the sync.sh script, called with the input and output arguments reversed and a new file-list argument, writes the results back to ceph. A possible subsequent step could be to write a success message back to SQS to notify the front end, which would in turn notify the user.
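Such a notification could be added to schedule.sh as one more dependent job; the sketch below assumes a hypothetical notify.sh wrapper (for example, one that calls aws sqs send-message on a result queue):

# Hedged sketch: an optional fifth job that reports completion (notify.sh is hypothetical).
JOB_ID5=$(sbatch -p copyq -M zeus -d afterok:$JOB_ID4 \
    notify.sh "$MYSCRATCH/lts-template/data/$OUTPUT_FILE" \
    | perl -ne 'm/(\d+).*$/g; print $1;' )
echo "Scheduled notify JOB_ID: $JOB_ID5"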
For further information regarding this workflow, please contact Kosta Servis.