Users' own software stack

Users' own software

Manual builds

Researchers need to recompile any software that they previously built manually on Setonix. This is necessary because the CPE now provides newer versions of various libraries and new library paths (specifically for MPICH); rebuilding against them ensures the best possible performance and avoids issues. You may also need to update the modules you load, since module versions have changed.


Conda/Mamba 

In general, software installed using Conda/Mamba should not be affected by the updates. The exception is software installed with the conda install --use-local option. The --use-local option installs packages from local files rather than from the external channels that Conda typically uses (e.g. bioconda or conda-forge). The paths of those local files may have changed, in which case the packages would need to be rebuilt using the updated paths. The default is to use external channels such as conda-forge, so we do not expect this to affect many users.
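
One way to check whether an environment is affected is to list its packages together with the channel each one came from. The sketch below assumes that locally built packages appear with a channel of `local` or a `file://` URL in the output of `conda list --show-channel-urls`; `filter_local_channel` is a hypothetical helper name, and `<env_name>` stands for your own environment.

```shell
# Hypothetical helper: keep only the lines of `conda list --show-channel-urls`
# whose last column (the channel) is "local" or a file:// URL.
# Assumption: packages installed with --use-local are labelled this way.
filter_local_channel() {
  awk '!/^#/ && ($NF == "local" || $NF ~ /^file:\/\//)'
}

# Usage (with your own environment name):
#   conda list -n <env_name> --show-channel-urls | filter_local_channel
```

If the helper prints nothing, the environment most likely uses only external channels and should not need rebuilding.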

The version of R provided as a module has changed from 4.1.0 to 4.2.2. This may cause issues with your installed R libraries and packages and require you to update your installed versions to be compatible with this newer version of R. We have provided an example to largely automate the re-installation of your software against the new version of R. This may not successfully reinstall all of your packages, but it should remove much of the burden of migrating to the newer R version. 


Step one is to collect the packages you had installed with the previous version of R and save them to a CSV file for use in the next step.

Obtaining packages to be updated
# Swap the pawseyenv module to the previous software stack and load the old R module version
$ module swap pawseyenv pawseyenv/2022.11
$ module load r/4.1.0

# Open an R session
$ R
# Get a list of your installed packages
> installed_packages <- installed.packages()
> write.csv(installed_packages, file = "installed_packages.csv")
# close the R session 
> quit()


Step two is to create a batch script to send to SLURM to do the installation based on the R script we will prepare in step three. 

install_packages.sh
#!/bin/bash -l
#SBATCH --account=XXXX
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=16GB
# You may need more or less time depending on how many packages you have to install
#SBATCH --time=05:00:00

#Load the new R module
module load r/4.2.2

#Run the R script below 
srun -N $SLURM_JOB_NUM_NODES -n $SLURM_NTASKS -c $SLURM_CPUS_PER_TASK Rscript update_packages.r


Step three is to create the R script that will do the installation for you. This uses the CSV file we created in step one. The R script will use the number of CPUs you provided it in the batch script (in this example, that would be four). This script will also tell you which packages were successful or unsuccessful in the SLURM log file. 

update_packages.r
# update_packages.r

# Load the necessary libraries
library(tools)

# Function to update packages
update_packages <- function() {
  # Get the path to the user's R library
  r_lib_path <- .libPaths()[1]
  
  # Load the saved list of installed packages
  installed_packages <- read.csv("installed_packages.csv")
  
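  # Get the names of the installed packages, skipping base packages
  # (these ship with R itself and cannot be reinstalled from CRAN)
  is_base <- installed_packages[, "Priority"] %in% "base"
  packages_to_update <- installed_packages[!is_base, "Package"]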
  
  # Initialize lists to keep track of successes and failures
  success_packages <- character(0)
  failure_packages <- character(0)

  # Get the number of cores provided by SLURM (fall back to 1 if unset)
  num_cores <- as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", unset = "1"))
  
  # Reinstall or update each package
  for (package in packages_to_update) {
    package_install_path <- file.path(r_lib_path, package)
    
    # Try installing the package with dependencies
    tryCatch({
      install.packages(package, lib = r_lib_path, dependencies = TRUE, 
                       Ncpus = num_cores, 
                       repos = "https://cran.rstudio.com/")
      
      # install.packages() only warns when an installation fails, so check
      # that the package directory now exists in the library
      if (!dir.exists(package_install_path)) {
        stop("installation of ", package, " failed")
      }
      
      # Add to the list of successfully updated packages
      success_packages <- c(success_packages, package)
    }, error = function(e) {
      # Add to the list of failed packages; <<- is needed because the
      # handler is a separate function with its own local scope
      failure_packages <<- c(failure_packages, package)
    })
  }
  
  # Print summary of updates
  cat("Summary of Package Updates:\n")
  cat("Successfully updated packages:", paste(success_packages, collapse = ", "), "\n")
  cat("Failed to update packages:", paste(failure_packages, collapse = ", "), "\n")
}

# Run the update_packages function
update_packages()

If you need a specific version of R, you may have to install your own copy in your /software partition. There are many ways to install R; one convenient option is Conda/Mamba, which can install R packages for you as well.

Python virtual environment 

Python virtual environments will need to be rebuilt if they were created using Pawsey-provided modules. The main change is that the python module is now version 3.10.10. The Python from the previous software stack is still present, but it is not available by default unless you load the Pawsey environment module associated with the previous deployment, pawseyenv/2022.11. We suggest activating the old environment to extract the list of installed packages, then creating a new environment from that list.


Terminal: update Python virtual environment
# extract packages installed in virtual environment
module swap pawseyenv pawseyenv/2022.11
module load python/3.9.15

# activate python environment to get requirements
source <path_to_env>/bin/activate 
pip freeze > requirements.txt 
deactivate 

# and create new environment
module unload python/3.9.15
module swap pawseyenv pawseyenv/2023.08
module load python/3.10.10
python -m venv <path_to_new_env>
source <path_to_new_env>/bin/activate 
pip install -r requirements.txt
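
Because `pip freeze` pins exact versions, some entries recorded under Python 3.9 may not be installable under Python 3.10. If `pip install -r requirements.txt` fails for that reason, one option is to relax the pins and let pip choose compatible versions. The following is a minimal sketch: `unpin_requirements` is a hypothetical helper and the output file name `requirements-unpinned.txt` is an arbitrary choice. Lines that are not simple `name==version` pins (for example `name @ file://...` entries) pass through unchanged and may need manual attention.

```shell
# Hypothetical helper: strip "==<version>" pins from a pip freeze file
# and print the result to standard output.
unpin_requirements() {
  sed -E 's/==.*$//' "$1"
}

# Usage:
#   unpin_requirements requirements.txt > requirements-unpinned.txt
#   pip install -r requirements-unpinned.txt
```

Unpinned installs can pull in newer package versions than you used before, so it is worth re-testing your workflow afterwards.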


Spack installations

Researchers who installed software with spack/0.17.0 will need to load the older software stack in order to load that version of spack and query it for previously installed packages. The steps involved are:

  • Load the old Pawsey environment module and the older version of spack.
  • Query spack to find all the old builds in your software space. For users in multiple projects, this will require querying each project (see here for how to change projects).
  • Generate scripts to check that the specifications are acceptable, then rebuild with the newer spack.

Pawsey provides a script to migrate software stacks, called spack_generate_migration_scripts.sh. It generates two further scripts, spack.specs.sh and spack.install.sh, which check the specifications and run the installation respectively. These generated scripts may need to be edited to get the desired results, as they endeavour to build the software with the same build-time options but not necessarily the same dependencies.

Once you are satisfied with the new builds, please clean up the old builds:

Uninstall old spack installed packages
# clean up old installation
module swap pawseyenv pawseyenv/2022.11
# load the older spack version
module load spack/0.17.0
# look for modules within your $MYSOFTWARE installation 
hashlist=($(lfs find "${MYSOFTWARE}/setonix/modules/" -name '*.lua' | sed 's/\.lua$//' | sed "s/-/ /g" | awk '{print $2}'))
for h in "${hashlist[@]}"
do
  spack uninstall /${h}
done
spack clean -a
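
Since `spack uninstall` is destructive, it can help to preview the hashes that the loop will act on before running it. The sketch below assumes module files are named `<version>-<hash>.lua`, which is the layout the pipeline above relies on. `hash_from_module` is a hypothetical helper that takes the text after the last hyphen in the file name, so hyphens elsewhere in the path do not shift the result; `MODULE_DIR` is a placeholder for your module directory.

```shell
# Hypothetical helper: extract the hash from a module file path,
# assuming the file is named <version>-<hash>.lua
hash_from_module() {
  basename "$1" .lua | sed 's/^.*-//'
}

# Dry run: print each hash instead of uninstalling it
#   find "$MODULE_DIR" -name '*.lua' | while read -r f; do
#     echo "would uninstall /$(hash_from_module "$f")"
#   done
```

Once the printed hashes match the builds you expect to remove, the uninstall loop can be run for real.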