How to Run Pangeo and Dask

This page describes how to run Pangeo and Dask in JupyterLab using a Python virtual environment on Setonix with Slurm. This involves launching a JupyterLab server through a Slurm batch job and then connecting to it with an SSH tunnel.

Installation

Here we use a Python virtual environment for the installation of Pangeo and Dask. This allows you to add new Python packages as required during your analysis. The installation steps are described below:

Terminal 1: Installing Dask and dependencies
$ cd $MYSOFTWARE # This path defaults to /software/projects/<user_project>/<username>/ 

# Load the required modules. Note that the specific version numbers may change
$ module load python/3.10.10
$ module load py-pip/23.1.2-py3.10.10

# Create and activate the virtual environment 
$ python -m venv pangeo
$ source ${MYSOFTWARE}/pangeo/bin/activate

# Install Pangeo, Dask, and dependencies
$ pip install dask-mpi dask distributed mpi4py jupyter-server-proxy jupyterlab ipywidgets xarray zarr numcodecs hvplot geoviews datashader widgetsnbextension dask-jobqueue dask-labextension notebook wheel netCDF4 pyFFTW basemap geos nodejs-bin

# Clean your pip cache to prevent exceeding your file count quota on the /software partition
$ pip cache purge

Setting up the batch script

Once the Python virtual environment is installed, you can launch JupyterLab on Setonix with the following script. The script activates the virtual environment and starts a JupyterLab server on a node in the work partition. You will need to edit the Slurm parameters and the working directory to suit your needs.

Listing 1. Launch_jupyter.sh
#!/bin/bash -l
# Allocate slurm resources, edit as necessary
#SBATCH --account=your_pawsey_account
#SBATCH --partition=work
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=8GB
#SBATCH --time=02:00:00
#SBATCH --nodes=1
#SBATCH --job-name=jupyter_notebook
#SBATCH --export=NONE
 
# Set our working directory
# Should be in a writable path with some space, like /scratch
dir="${MYSCRATCH}/jupyter"
 
# Load dependencies for pangeo
module load python/3.10.10
source ${MYSOFTWARE}/pangeo/bin/activate

# You should not need to edit the lines below
 
# Prepare the working directory
mkdir -p ${dir}
cd ${dir}

# Get the hostname
# We'll set up an SSH tunnel to connect to the Jupyter notebook server
host=$(hostname)
 
# Set the port for the SSH tunnel
# This part of the script uses a loop to search for available ports on the node;
# this will allow multiple instances of GUI servers to be run from the same host node
port="8888"
pfound="0"
while [ $port -lt 65535 ] ; do
  check=$( ss -tuna | awk '{print $4}' | grep ":$port *" )
  if [ "$check" == "" ] ; then
    pfound="1"
    break
  fi
  : $((++port))
done
if [ $pfound -eq 0 ] ; then
  echo "No available communication port found to establish the SSH tunnel."
  echo "Try again later. Exiting."
  exit
fi

 
echo "*****************************************************"
echo "Setup - from your laptop do:"
echo "ssh -N -f -L ${port}:${host}:${port} $USER@$PAWSEY_CLUSTER.pawsey.org.au"
echo "*****"
echo "The launch directory is: $dir"
echo "*****************************************************"
echo ""
echo "*****************************************************"
echo "Terminate - from your laptop do:"
echo "kill \$( ps x | grep 'ssh.*-L *${port}:${host}:${port}' | awk '{print \$1}' )"
echo "*****************************************************"
echo ""
  
# Launch the notebook
srun -N $SLURM_JOB_NUM_NODES -n $SLURM_NTASKS -c $SLURM_CPUS_PER_TASK \
  jupyter lab \
  --no-browser \
  --port=${port} --ip=0.0.0.0 \
  --notebook-dir=${dir}

Please note that the port forwarding will not work correctly if you run Jupyter on the login node.

Run your Jupyter notebook server

To start, submit the Slurm job script. The job may take a few minutes to start, depending on how busy the queue is. Once the job starts, you will have a Slurm output file in your submission directory; the end of this file contains instructions on how to connect.

Terminal 2. Submitting sbatch script
$ sbatch Launch_jupyter.sh
Submitted batch job 2850635 
$ cat slurm-2850635.out
.
.
*****************************************************
Setup - from your laptop do:
ssh -N -f -L 8888:nid002024:8888 sbeecroft@setonix.pawsey.org.au
*****
The launch directory is: /scratch/pawsey0001/sbeecroft/jupyter
*****************************************************

*****************************************************
Terminate - from your laptop do:
kill $( ps x | grep 'ssh.*-L *8888:nid002024:8888' | awk '{print $1}' )
*****************************************************

[I 2023-07-06 14:15:07.312 ServerApp] Package jupyterlab took 0.0000s to import
[I 2023-07-06 14:15:07.379 ServerApp] Package jupyter_lsp took 0.0670s to import
[W 2023-07-06 14:15:07.379 ServerApp] A `_jupyter_server_extension_points` function was not found in jupyter_lsp. Instead, a `_jupyter_server_extension_paths` function was found and will be used for now. This function name will be deprecated in future releases of Jupyter Server.
[I 2023-07-06 14:15:07.576 ServerApp] Package jupyter_server_proxy took 0.1962s to import
[I 2023-07-06 14:15:07.603 ServerApp] Package jupyter_server_terminals took 0.0274s to import
[I 2023-07-06 14:15:07.604 ServerApp] Package notebook_shim took 0.0000s to import
[W 2023-07-06 14:15:07.604 ServerApp] A `_jupyter_server_extension_points` function was not found in notebook_shim. Instead, a `_jupyter_server_extension_paths` function was found and will be used for now. This function name will be deprecated in future releases of Jupyter Server.
[I 2023-07-06 14:15:07.605 ServerApp] jupyter_lsp | extension was successfully linked.
[I 2023-07-06 14:15:07.605 ServerApp] jupyter_server_proxy | extension was successfully linked.
[I 2023-07-06 14:15:07.610 ServerApp] jupyter_server_terminals | extension was successfully linked.
[I 2023-07-06 14:15:07.614 ServerApp] jupyterlab | extension was successfully linked.
[I 2023-07-06 14:15:08.417 ServerApp] notebook_shim | extension was successfully linked.
[I 2023-07-06 14:15:08.472 ServerApp] notebook_shim | extension was successfully loaded.
[I 2023-07-06 14:15:08.474 ServerApp] jupyter_lsp | extension was successfully loaded.
[I 2023-07-06 14:15:08.491 ServerApp] jupyter_server_proxy | extension was successfully loaded.
[I 2023-07-06 14:15:08.492 ServerApp] jupyter_server_terminals | extension was successfully loaded.
[I 2023-07-06 14:15:08.493 LabApp] JupyterLab extension loaded from /software/projects/pawsey0001/sbeecroft/setonix/python/lib/python3.10/site-packages/jupyterlab
[I 2023-07-06 14:15:08.493 LabApp] JupyterLab application directory is /software/projects/pawsey0001/sbeecroft/setonix/python/share/jupyter/lab
[I 2023-07-06 14:15:08.494 LabApp] Extension Manager is 'pypi'.
[I 2023-07-06 14:15:08.496 ServerApp] jupyterlab | extension was successfully loaded.
[I 2023-07-06 14:15:08.497 ServerApp] Serving notebooks from local directory: /scratch/pawsey0001/sbeecroft/jupyter
[I 2023-07-06 14:15:08.497 ServerApp] Jupyter Server 2.6.0 is running at:
[I 2023-07-06 14:15:08.497 ServerApp] http://nid002024:8888/lab?token=a8135a22fab1a3f97214fa1424eefb25c4e415f6caaab030
[I 2023-07-06 14:15:08.497 ServerApp]     http://127.0.0.1:8888/lab?token=a8135a22fab1a3f97214fa1424eefb25c4e415f6caaab030
[I 2023-07-06 14:15:08.497 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 2023-07-06 14:15:08.503 ServerApp] 
    
    To access the server, open this file in a browser:
        file:///home/sbeecroft/.local/share/jupyter/runtime/jpserver-255258-open.html
    Or copy and paste one of these URLs:
        http://nid002024:8888/lab?token=a8135a22fab1a3f97214fa1424eefb25c4e415f6caaab030
        http://127.0.0.1:8888/lab?token=a8135a22fab1a3f97214fa1424eefb25c4e415f6caaab030
[I 2023-07-06 14:15:08.550 ServerApp] Skipped non-installed server(s): bash-language-server, dockerfile-language-server-nodejs, javascript-typescript-langserver, jedi-language-server, julia-language-server, pyright, python-language-server, python-lsp-server, r-languageserver, sql-language-server, texlab, typescript-language-server, unified-language-server, vscode-css-languageserver-bin, vscode-html-languageserver-bin, vscode-json-languageserver-bin, yaml-language-server
[W 2023-07-06 14:22:47.293 ServerApp] 404 GET /apple-touch-icon-precomposed.png (@10.253.128.43) 34.74ms referer=None
[W 2023-07-06 14:22:47.524 ServerApp] 404 GET /apple-touch-icon.png (@10.253.128.43) 1.48ms referer=None

In a separate local terminal window, run the SSH command listed in the output file:

ssh -N -f -L 8888:nid002024:8888 <username>@setonix.pawsey.org.au

Supply your Setonix password if requested, then open your web browser and navigate to the Jupyter address (e.g. http://127.0.0.1:8888/lab?token=a8135a22fab1a3f97214fa1424eefb25c4e415f6caaab030 in the above example). This will take you to JupyterLab, where you can run Pangeo and Dask.

When selecting the URL to use in your browser, ensure you use the address with 127.0.0.1 and not nidXXXXX.
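
As a quick check that the environment is working, you can run a cell like the following in a new notebook. It is a minimal sketch that simply imports a few of the packages installed earlier and prints their versions (the exact versions will depend on your installation):

# Confirm that the core Pangeo/Dask packages installed into the virtual environment import correctly
import dask
import xarray
import zarr

print(dask.__version__, xarray.__version__, zarr.__version__)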

Example Dask usage from Jupyter

The following is provided as an example of how you might use Dask from within the Jupyter session.

Listing 2. Example Dask usage in Jupyter
from dask.distributed import Client
from dask_jobqueue import SLURMCluster

cluster = SLURMCluster(
    cores=24,
    memory='100GB',
    shebang='#!/bin/bash -l',  # default is bash
    processes=6,
    local_directory='/scratch/your_working_dir',
    job_extra_directives=['--account=pawseyXXXX'],  # additional job-specific options
    walltime='02:00:00',
    queue='work',
)
cluster.scale(jobs=2)  # launch 2 jobs, each of which starts 6 worker processes
# Alternatively, scale by cores or memory directly:
# cluster.scale(cores=48)
# cluster.scale(memory="200 GB")

# Print the job script for you to review
print(cluster.job_script())

# Connect the cluster to the notebook
client = Client(cluster)
client
# You should then see the workers spawn and the dashboard start up. You can also check
# the jobs spawning on the Setonix terminal with `watch squeue -u username -l`
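
Once the workers have connected, computations submitted from the notebook run on the Slurm-managed workers. The following is a minimal sketch of a test computation, assuming the client object created above; the array size and chunking are arbitrary illustration values:

import dask.array as da

# Optionally block until at least a couple of workers have registered with the scheduler
client.wait_for_workers(n_workers=2)

# Build a chunked random array and compute its mean on the workers
x = da.random.random((10000, 10000), chunks=(1000, 1000))
print(x.mean().compute())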

Cleaning up when you are finished

Once you have finished:

  • Cancel your job with scancel <job_id>.
  • Kill the SSH tunnel, based on the command displayed in the output file:

kill $( ps x | grep 'ssh.*-L *8888:nid002024:8888' | awk '{print $1}' )
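
If the Dask cluster from the example above is still running, you can also release its worker jobs from within the notebook before cancelling the Jupyter job. A minimal sketch, assuming the client and cluster objects created earlier:

# Shut down the Dask scheduler and workers started by SLURMCluster;
# this also cancels the corresponding Slurm worker jobs
client.close()
cluster.close()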

These external links may be useful for you in making the most of Dask:

  • Dask documentation: https://docs.dask.org
  • Dask-Jobqueue documentation: https://jobqueue.dask.org