...
Staging can be performed with a Slurm job script that uses the data-transfer nodes (the copy partition) to retrieve the initial data stored in Acacia and place it in the working directory on /scratch. Note that the original data is packed in a .tar file, so the script also extracts ("untars") the data:
Listing 1. stageFromAcaciaTar.sh (rclone)

#!/bin/bash --login
#---------------
#About this script
#stageFromAcaciaTar.sh : copies a tar object from Acacia and extracts it in the destination path
#---------------
#Requested resources:
#SBATCH --account=[yourProjectName]
#SBATCH --job-name=stageTar.rclone
#SBATCH --partition=copy
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --time=[requiredTime]
#SBATCH --export=NONE
#-----------------
#Loading the required modules
module load rclone/<version> #This example uses the rclone client
#-----------------
#Defining variables that will handle the names related to your access, buckets and stored objects in Acacia
profileName=<profileNameGivenToYourProfileOfAccessToAcacia>
bucketName=<bucketInAcaciaContainingTheData>
prefixPath=<prefixPathInBucketUsedToOrganiseTheData>
fullPathInAcacia="${profileName}:${bucketName}/${prefixPath}" #Note the colon(:) when using rclone
#-----------------
#Name of the file to be transferred and auxiliary dir to temporarily place it
tarFileName=<nameOfTheTarFileContainingInitialData>
auxiliaryDirForTars="$MYSCRATCH/tars"
echo "Checking that the auxiliary directory exists"
if ! [ -d "$auxiliaryDirForTars" ]; then
    echo "Trying to create the auxiliary directory as it does not exist"
    mkdir -p "$auxiliaryDirForTars"; exitcode=$?
    if [ $exitcode -ne 0 ]; then
        echo "The auxiliary directory $auxiliaryDirForTars does not exist and can't be created"
        echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
        exit 1
    fi
fi
#-----------------
#Working directory in scratch for the supercomputing job
workingDir="$MYSCRATCH/<workingDirectoryForSupercomputingJob>"
echo "Checking that the working directory exists"
if ! [ -d "$workingDir" ]; then
    echo "Trying to create the working directory as it does not exist"
    mkdir -p "$workingDir"; exitcode=$?
    if [ $exitcode -ne 0 ]; then
        echo "The working directory $workingDir does not exist and can't be created"
        echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
        exit 1
    fi
fi
#-----------------
#Check if Acacia definitions make sense, and if the object to transfer exists
echo "Checking that the profile exists"
rclone config show | grep "${profileName}" > /dev/null; exitcode=$?
if [ $exitcode -ne 0 ]; then
    echo "The given profileName=$profileName seems not to exist in the user configuration of rclone"
    echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
    exit 1
fi
echo "Checking that bucket exists and that you have read access"
rclone lsd "${profileName}:${bucketName}" > /dev/null; exitcode=$? #Note the colon(:) when using rclone
if [ $exitcode -ne 0 ]; then
    echo "The bucket name or the profile name may be wrong: ${profileName}:${bucketName}"
    echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
    exit 1
fi
echo "Checking if the file can be listed in Acacia"
listResult=$(rclone lsl "${fullPathInAcacia}/${tarFileName}")
if [ -z "$listResult" ]; then
    echo "Problems occurred during the listing of the file ${tarFileName}"
    echo "Check that the file exists in the fullPathInAcacia: ${fullPathInAcacia}/"
    echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
    exit 1
fi
#-----------------
#Perform the transfer of the tar file into the auxiliary directory and check for the transfer
echo "Performing the transfer ... "
srun rclone copy "${fullPathInAcacia}/${tarFileName}" "${auxiliaryDirForTars}/"; exitcode=$?
if [ $exitcode -ne 0 ]; then
    echo "Problems occurred during the transfer of file ${tarFileName}"
    echo "Check that the file exists in the fullPathInAcacia: ${fullPathInAcacia}/"
    echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
    exit 1
fi
#-----------------
#Perform untarring with desired options into the working directory
echo "Performing the untarring ... "
#tarOptions=( --strip-components 8 ) #Avoiding creation of some directories in the path
srun tar -xvzf "${auxiliaryDirForTars}/${tarFileName}" -C "$workingDir" "${tarOptions[@]}"; exitcode=$?
if [ $exitcode -ne 0 ]; then
    echo "Problems occurred during the untarring of file ${auxiliaryDirForTars}/${tarFileName}"
    echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
    exit 1
else
    echo "Removing the tar file as it has been successfully untarred"
    rm "$auxiliaryDirForTars/$tarFileName" #Comment out this line when debugging the workflow
fi
#-----------------
## Final checks ....
#---------------
#Successfully finished
echo "Done"
exit 0
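
Every failure path in the script exits with a non-zero code so that dependent jobs do not start. The following is a minimal sketch of how that could be used at submission time, not a prescribed workflow. It assumes the listing is saved as stageFromAcaciaTar.sh and that the supercomputing job lives in a script called computeJob.sh; that file name and the function name submit_staged_workflow are hypothetical and do not appear in the original listing. The sketch relies on the Slurm options `sbatch --parsable` (print only the job ID) and `--dependency=afterok:<jobid>` (start only if that job ended with exit code 0).

```bash
#!/bin/bash
# Sketch only: chain a compute job to the staging job with a Slurm dependency.
# computeJob.sh and submit_staged_workflow are hypothetical names.

submit_staged_workflow() {
    local stagingScript="$1"   # e.g. stageFromAcaciaTar.sh
    local computeScript="$2"   # e.g. computeJob.sh (hypothetical)
    local stagingJobID computeJobID

    # --parsable makes sbatch print "jobid" (or "jobid;cluster") only
    stagingJobID=$(sbatch --parsable "$stagingScript") || return 1
    stagingJobID="${stagingJobID%%;*}"   # drop the optional ";cluster" suffix

    # afterok: the compute job starts only if staging finished with exit code 0
    computeJobID=$(sbatch --parsable --dependency=afterok:"$stagingJobID" "$computeScript") || return 1
    computeJobID="${computeJobID%%;*}"

    echo "staging=$stagingJobID compute=$computeJobID"
}

# Example call (commented out so that sourcing this file submits nothing):
# submit_staged_workflow stageFromAcaciaTar.sh computeJob.sh
```

If the staging job fails, the dependency can never be satisfied. Depending on the site configuration, the dependent job may then stay pending until it is cancelled, so it is worth checking the queue after a failed staging run.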
Note

- Variables are used to store the names of directories, files, buckets, prefixes, objects, etc.
- Several checks are performed at different stages of the script, and the output of most commands used for checking is redirected to /dev/null (as we are not interested in their output).
- Messages from the mc client are too verbose when transferring files (even with the --quiet option), so scripts that use that client redirect the output of transfers to /dev/null. For this reason, we often find rclone a better choice of client for scripts.
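
The pattern described in the note (run a check, discard its standard output, act on the exit code) is repeated several times in the script. As a minimal sketch, it could be wrapped in a small helper function. The name quietCheck is hypothetical and is not part of the original script. As in the listing, only standard output is discarded, so error messages from the command still reach the job log.

```bash
#!/bin/bash
# Sketch only: run a check command with its standard output discarded.
# quietCheck is a hypothetical helper, not part of the original listing.

quietCheck() {
    local message="$1"   # text printed if the check fails
    shift                # the remaining arguments form the command to run
    if "$@" > /dev/null; then
        return 0
    fi
    echo "$message"
    echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
    return 1
}

# Possible use inside the staging script (commented out here because it needs
# rclone and the variables defined in the listing):
# quietCheck "The bucket name or the profile name may be wrong: ${profileName}:${bucketName}" \
#     rclone lsd "${profileName}:${bucketName}" || exit 1
```

Returning a status instead of calling exit inside the function leaves the decision to stop with the caller, which keeps the helper usable in interactive tests as well as in job scripts.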
...