
Staging can be performed with a slurm job script that makes use of the data-transfer nodes (copy partition) to copy the initial data, originally stored in Acacia, into the working directory on /scratch. Note that the original data is packed within a .tar file, so the script also performs the "untarring" of the data:


Code Block
languagebash
themeEmacs
titleListing 1. rclone stageFromAcaciaTar.sh
linenumberstrue
#!/bin/bash --login

#---------------
#About this script
#stageFromAcaciaTar.sh : copies a tar object from Acacia and extracts it in the destination path

#---------------
#Requested resources:
#SBATCH --account=[yourProjectName]
#SBATCH --job-name=stageTar.rclone
#SBATCH --partition=copy
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --time=[requiredTime]
#SBATCH --export=NONE

#-----------------
#Loading the required modules
module load rclone/<version> #This example uses the rclone client.

#-----------------
#Defining variables that will handle the names related to your access, buckets and stored objects in Acacia
profileName=<profileNameGivenToYourProfileOfAccessToAcacia>
bucketName=<bucketInAcaciaContainingTheData>
prefixPath=<prefixPathInBucketUsedToOrganiseTheData>
fullPathInAcacia="${profileName}:${bucketName}/${prefixPath}" #Note the colon(:) when using rclone

#-----------------
#Name of the file to be transferred and auxiliary dir to temporarily place it
tarFileName=<nameOfTheTarFileContainingInitialData>
auxiliaryDirForTars="$MYSCRATCH/tars"
echo "Checking that the auxiliary directory exists"
if ! [ -d $auxiliaryDirForTars ]; then
   echo "Trying to create the auxiliary directory as it does not exist"
   mkdir -p $auxiliaryDirForTars; exitcode=$?
   if [ $exitcode -ne 0 ]; then
      echo "The auxiliary directory $auxiliaryDirForTars does not exist and can't be created"
      echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
      exit 1
   fi
fi

#-----------------
#Working directory in scratch for the supercomputing job
workingDir="$MYSCRATCH/<workingDirectoryForSupercomputingJob>"
echo "Checking that the working directory exists"
if ! [ -d $workingDir ]; then
   echo "Trying to create the working directory as it does not exist"
   mkdir -p $workingDir; exitcode=$?
   if [ $exitcode -ne 0 ]; then
      echo "The working directory $workingDir does not exist and can't be created"
      echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
      exit 1
   fi
fi

#-----------------
#Check if Acacia definitions make sense, and if the object to transfer exists
echo "Checking that the profile exists"
rclone config show | grep "${profileName}" > /dev/null; exitcode=$?
if [ $exitcode -ne 0 ]; then
   echo "The given profileName=$profileName seems not to exist in the user configuration of rclone"
   echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
   exit 1
fi
echo "Checking that the bucket exists and that you have read access"
rclone lsd "${profileName}:${bucketName}" > /dev/null; exitcode=$? #Note the colon(:) when using rclone
if [ $exitcode -ne 0 ]; then
   echo "The bucket name or the profile name may be wrong: ${profileName}:${bucketName}"
   echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
   exit 1
fi
echo "Checking if the file can be listed in Acacia"
listResult=$(rclone lsl "${fullPathInAcacia}/${tarFileName}")
if [ -z "$listResult" ]; then
   echo "Problems occurred during the listing of the file ${tarFileName}"
   echo "Check that the file exists in the fullPathInAcacia: ${fullPathInAcacia}/"
   echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
   exit 1
fi

#-----------------
#Perform the transfer of the tar file into the auxiliary directory and check for the transfer
echo "Performing the transfer ... "
srun rclone copy "${fullPathInAcacia}/${tarFileName}" "${auxiliaryDirForTars}/"; exitcode=$?
if [ $exitcode -ne 0 ]; then
   echo "Problems occurred during the transfer of file ${tarFileName}"
   echo "Check that the file exists in the fullPathInAcacia: ${fullPathInAcacia}/"
   echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
   exit 1
fi

#-----------------
#Perform untarring with the desired options into the working directory
echo "Performing the untarring ... "
tarOptions=( ) #Initialised empty; add any desired tar options here
#tarOptions=( --strip-components 8 ) #Example: avoid creating some leading directories in the path
srun tar -xvzf "${auxiliaryDirForTars}/${tarFileName}" -C $workingDir "${tarOptions[@]}"; exitcode=$?
if [ $exitcode -ne 0 ]; then
   echo "Problems occurred during the untarring of file ${auxiliaryDirForTars}/${tarFileName}"
   echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
   exit 1
else
   echo "Removing the tar file as it has been successfully untarred"
   rm $auxiliaryDirForTars/$tarFileName #comment this line when debugging workflow
fi

#-----------------
## Final checks ....

#---------------
#Successfully finished
echo "Done"
exit 0


Note
titleClient support
  • Note the use of variables to store the names of directories, files, buckets, prefixes, objects, etc.
  • Also note the several checks at the different parts of the script, and the redirection to /dev/null in most of the commands used for checking correctness (as we are not interested in their output).
  • As messages from the mc client are too verbose when transferring files (even with the --quiet option), we redirect its output messages to /dev/null when performing transfers with this client in scripts. For this reason we often find rclone the better choice for use in scripts.
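The check-and-exit pattern described in the note above can be rehearsed outside the cluster. The following is a minimal runnable sketch, with `grep` on a dummy string standing in for an rclone command and a made-up profile name:

```shell
#!/bin/bash
#Generic pattern used in the listings: run a check command, silence its
#output with /dev/null (we only care about the exit code), then act on
#the captured exit code.
profileName="myAcaciaProfile" #hypothetical name, for illustration only

echo "[myAcaciaProfile]" | grep "${profileName}" > /dev/null; exitcode=$?
if [ $exitcode -ne 0 ]; then
   echo "The given profileName=$profileName seems not to exist"
   exit 1
fi
echo "Check passed"
```

In the real scripts the command under test is an rclone (or mc) invocation, but the surrounding logic is exactly this.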




Script for performing a supercomputing job

The slurm job script for performing the supercomputing job makes use of the initial data staged in the previous step. This job requests execution on the compute nodes (work partition).



Code Block
languagebash
themeEmacs
titleListing 2. superExecutionSetonix.sh
linenumberstrue
#!/bin/bash -l

#SBATCH --account=[yourProject]
#SBATCH --job-name=superExecution
#SBATCH --partition=work
#SBATCH --ntasks=[numberOfCoresToUse]
#SBATCH --time=[requiredTime]
#SBATCH --export=none

#--------------
#Load required modules here ...

#---------------
#Defining the working dir
workingDir="$MYSCRATCH/<pathAndNameOfWorkingDirectory>"

#---------------
#Entering the working dir
cd $workingDir

#---------------
#Check for the correct staging of the initial conditions if needed ...

#---------------
#Supercomputing execution
srun <theTool> <neededArguments>

#---------------
#Successfully finished
echo "Done"
exit 0
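The "check for the correct staging" placeholder in Listing 2 could, for example, verify that the expected input files are present before launching the tool. A runnable sketch, where the file name `initial.dat` and the temporary directory are made up for illustration:

```shell
#!/bin/bash
#Example staging check for the placeholder in Listing 2: exit with a
#non-zero code (informing dependent jobs) if expected input is missing.
workingDir="$(mktemp -d)"          #stands in for $MYSCRATCH/<workingDir>
touch "${workingDir}/initial.dat"  #pretend this file was staged from Acacia

if ! [ -f "${workingDir}/initial.dat" ]; then
   echo "Initial data is missing from ${workingDir}"
   exit 1
fi
echo "Staging check passed, ready to run the tool"
```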


Script for storing results into Acacia

Storing can be performed with a slurm job script that makes use of the data-transfer nodes (copy partition) to handle new results in the working directory on /scratch and store them in Acacia. Note that the data is first packed into a .tar file and then transferred to Acacia.
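Before wiring the storing step into a workflow, the packing logic can be rehearsed locally. The following self-contained sketch mirrors the tar step of Listing 3; all file and directory names are made up:

```shell
#!/bin/bash
#Local rehearsal of the packing step in Listing 3, using throw-away files.
workingDir="$(mktemp -d)"; cd "$workingDir" || exit 1
mkdir -p dir1
echo "result A" > file1
echo "result B" > dir1/file2

whatToTar=( file1 dir1/ )          #what the storing script would pack
tarFileName=case_A1-newData.tar.gz
tar -czf "${tarFileName}" "${whatToTar[@]}"; exitcode=$?
if [ $exitcode -ne 0 ]; then
   echo "Something went wrong when tarring"
   exit 1
fi
tar -tzf "${tarFileName}"          #list the archive contents as a sanity check
```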

Code Block
languagebash
themeEmacs
titleListing 3. rclone storeIntoAcaciaTar.sh
linenumberstrue
#!/bin/bash --login

#SBATCH --account=[yourProjectName]
#SBATCH --job-name=storeTar
#SBATCH --partition=copy
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --time=[requiredTime]
#SBATCH --export=NONE

#-----------------
#Loading the required modules
module load rclone/<version> #This example uses the rclone client.

#-----------------
#Defining variables that will hold the names related to your access, buckets and objects to be stored in Acacia
profileName=<profileNameGivenToYourProfileOfAccessToAcacia>
bucketName=<bucketInAcaciaContainingTheData>
prefixPath=<prefixPathInBucketUsedToOrganiseTheData>
fullPathInAcacia="${profileName}:${bucketName}/${prefixPath}" #Note the colon(:) when using rclone

#-----------------
#Check if Acacia definitions make sense, and if you can transfer objects into the desired bucket
echo "Checking that the profile exists"
rclone config show | grep "${profileName}" > /dev/null; exitcode=$?
if [ $exitcode -ne 0 ]; then
   echo "The given profileName=$profileName seems not to exist in the user configuration of rclone"
   echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
   exit 1
fi
echo "Checking that the bucket exists and that you have write access"
rclone lsd "${profileName}:${bucketName}" > /dev/null; exitcode=$? #Note the colon(:) when using rclone
if [ $exitcode -ne 0 ]; then
   echo "The bucket intended to receive the data does not exist: ${profileName}:${bucketName}"
   echo "Trying to create it"
   rclone mkdir "${profileName}:${bucketName}"; exitcode=$?
   if [ $exitcode -ne 0 ]; then
      echo "Creation of bucket failed"
      echo "The bucket name or the profile name may be wrong: ${profileName}:${bucketName}"
      echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
      exit 1
   fi
fi
echo "Checking if a test file can be transferred into the desired full path in Acacia"
testFile=test_file_${SLURM_JOBID}.txt
echo "File for test" > "${testFile}"
rclone copy "${testFile}" "${fullPathInAcacia}/"; exitcode=$?
if [ $exitcode -ne 0 ]; then
   echo "The test file $testFile cannot be transferred into ${fullPathInAcacia}"
   echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
   exit 1
fi
echo "Checking if the test file can be listed in Acacia"
listResult=$(rclone lsl "${fullPathInAcacia}/${testFile}")
if [ -z "$listResult" ]; then
   echo "Problems occurred during the listing of the test file ${testFile} in ${fullPathInAcacia}"
   echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
   exit 1
fi
echo "Removing test file from Acacia"
rclone delete "${fullPathInAcacia}/${testFile}"; exitcode=$?
if [ $exitcode -ne 0 ]; then
   echo "The test file $testFile cannot be removed from ${fullPathInAcacia}"
   echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
   exit 1
fi
rm $testFile

#----------------
#Defining the working dir and cd into it
workingDir="$MYSCRATCH/<workingDirectoryOfSupercomputingJob>"
echo "Checking that the working directory exists"
if ! [ -d $workingDir ]; then
   echo "The working directory $workingDir does not exist"
   echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
   exit 1
else
   cd $workingDir
fi

#----------------
#Defining what to tar and the name and place of the tarfile
whatToTar=( file1 dir1/ dir2/file2 dir3/file* ) #These are inside the working dir
tarFileName=case_A1-newData.tar.gz
auxiliaryDirForTars="$MYSCRATCH/tars"
echo "Checking that the auxiliary directory exists"
if ! [ -d $auxiliaryDirForTars ]; then
   echo "Trying to create the auxiliary directory as it does not exist"
   mkdir -p $auxiliaryDirForTars; exitcode=$?
   if [ $exitcode -ne 0 ]; then
      echo "The auxiliary directory $auxiliaryDirForTars does not exist and can't be created"
      echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
      exit 1
   fi
fi

#----------------
#Tarring the indicated files and directories
srun tar -cvzf "${auxiliaryDirForTars}/${tarFileName}" "${whatToTar[@]}"; exitcode=$?
if [ $exitcode -ne 0 ]; then
   echo "Something went wrong when tarring:"
   echo "tarFileName=${tarFileName}"
   echo "whatToTar=( ${whatToTar[@]} )"
   echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
   exit 1
fi

#-----------------
#Perform the transfer of the tar file into Acacia and check for the transfer
echo "Performing the transfer ... "
srun rclone copy "${auxiliaryDirForTars}/${tarFileName}" "${fullPathInAcacia}/"; exitcode=$?
if [ $exitcode -ne 0 ]; then
   echo "Problems occurred during the transfer of file ${tarFileName}"
   echo "Check that the file exists in ${workingDir}"
   echo "And that nothing is wrong with the fullPathInAcacia: ${fullPathInAcacia}/"
   echo "Exiting the script with non-zero code in order to inform job dependencies not to continue."
   exit 1
else
   echo "Removing the tar file from scratch as it was successfully transferred into Acacia"
   echo "Final place in Acacia: ${fullPathInAcacia}/${tarFileName}"
   rm "${auxiliaryDirForTars}/${tarFileName}" #Comment this line when debugging workflow
fi

#---------------
# Final checks ...

#---------------
#Successfully finished
echo "Done"
exit 0

Coordinating the different steps (scripts) with job dependencies

...

In principle the same scripts can be used on other systems, but they will need to be adapted accordingly. The main adaptation would be the impossibility of coordinating the different steps through dependencies when the scripts are to be executed on different clusters; for example, using the data-mover nodes on zeus (copyq) and the compute nodes on magnus (workq) for the supercomputing job.
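The coordination through job dependencies discussed in this section is typically achieved with Slurm's `--dependency` option. A minimal sketch, assuming the three job scripts are named as in the listings above (`--parsable` makes sbatch print only the job ID, and `afterok` starts a job only if the listed dependency finished with exit code zero; this is why the scripts exit with non-zero codes on failure):

```shell
#!/bin/bash
#A sketch, not a definitive recipe; requires a Slurm cluster to run.
#Submit the staging job first and capture its job ID.
stageJobID=$(sbatch --parsable stageFromAcaciaTar.sh)

#The supercomputing job starts only if staging finished successfully.
superJobID=$(sbatch --parsable --dependency=afterok:${stageJobID} superExecutionSetonix.sh)

#The storing job starts only if the supercomputing job succeeded.
sbatch --dependency=afterok:${superJobID} storeIntoAcaciaTar.sh
```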

Related pages