
Excerpt

This page describes how to run JupyterLab in a container on Pawsey systems with Slurm. This involves launching JupyterLab and then connecting to the Jupyter server.

...

For this example, we're going to be using the jupyter/datascience-notebook (external site) Docker image. It provides a Conda environment with a large collection of common Python packages (including NumPy, SciPy, Pandas, Scikit-learn, Bokeh and Matplotlib), an R environment (with the tidyverse (external site) packages), and a Julia environment. All of these are accessible via a Jupyter notebook server.
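
The image is published on Docker Hub. If you want to fetch it before submitting the batch job (Listing 2 below also pulls it as part of the job), a pull along these lines should work; the module name is the one used in Listing 2.

Code Block
languagebash
themeEmacs
# Pull the Docker image and convert it to a Singularity image file (SIF)
module load singularity/4.1.0-nompi
singularity pull datascience-notebook_latest.sif docker://jupyter/datascience-notebook:latest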

This Docker image ships with a startup script that accepts a number of runtime options. Most of these are specific to running the container with Docker; here we focus on how to run it with Singularity.

The datascience-notebook image has a default user, jovyan, and it assumes that you will be able to write to /home/jovyan. When you run a Docker container via Singularity, you run as your Pawsey username inside the container, so you won't be able to write to /home/jovyan. Instead, we can mount a specific directory (on Pawsey's filesystems) into the container at /home/jovyan. This allows the Jupyter server to do things like save notebooks and write checkpoint files, and those files will persist on Pawsey's filesystems after the container has stopped.
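
As a minimal interactive sketch of this idea (the directory name is just an example; the full batch workflow is given in Listing 2 below):

Code Block
languagebash
themeEmacs
# Hypothetical interactive example: mount a scratch directory over /home/jovyan
mkdir -p ${MYSCRATCH}/jupyter-dir
singularity exec \
  -B ${MYSCRATCH}/jupyter-dir:/home/jovyan \
  docker://jupyter/datascience-notebook:latest \
  ls -ld /home/jovyan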

...

  • Use of the gpu partition on Setonix
  • Request a GPU from Slurm
  • Pass the environment variable ROCM_PATH to Singularity
  • Run the container using the --rocm flag to enable GPU support in Singularity (a condensed sketch of these lines follows this list; Listing 2 gives the full script)

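Condensed, the GPU-related lines of Listing 2 look roughly like the sketch below; the module version and variable name are taken from the listing, so check the modules currently available on your system.

Code Block
languagebash
themeEmacs
#SBATCH --partition=gpu                      # GPU partition on Setonix
#SBATCH --gres=gpu:1                         # request a single GPU from Slurm

module load rocm/5.7.3                       # example version; use a current ROCm module
export SINGULARITYENV_ROCM_PATH=$ROCM_PATH   # pass the ROCm location into the container

singularity exec --rocm ...                  # enable GPU support in Singularity
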
...

Column
width900px


Code Block
languagebash
themeEmacs
titleListing 2. Slurm script for running JupyterLab in a GPU-enabled container
collapsetrue
#!/bin/bash -l
# This example is for GPUs on Setonix
# Allocate slurm resources, edit as necessary
#SBATCH --account=[your-project-name]
# Here we request the appropriate GPU partition on a system
#SBATCH --partition=gpu
# Be aware that the request for GPU resources may change in later versions of Slurm
#SBATCH --nodes=1
#SBATCH --gres=gpu:1  #Asking for a single GPU-pack (Read documentation about running GPU jobs)
#SBATCH --time=02:00:00
#SBATCH --job-name=jupyter_notebook
#SBATCH --export=NONE

# Set our working directory
# This is the directory we'll mount to /home/jovyan in the container
# Should be in a writable path with some space, like /scratch
jupyterDir="${MYSCRATCH}/jupyter-dir"

# Set the image and tag we want to use
image="docker://jupyter/datascience-notebook:latest"

# You should not need to edit the lines below

# Prepare the working directory
mkdir -p ${jupyterDir}
cd ${jupyterDir}

# Get the image filename
imagename=${image##*/}
imagename=${imagename/:/_}.sif

# Get the hostname
# We'll set up an SSH tunnel to connect to the Jupyter notebook server
host=$(hostname)

# Set the port for the SSH tunnel
# This part of the script uses a loop to search for available ports on the node;
# this will allow multiple instances of GUI servers to be run from the same host node
port="8888"
pfound="0"
while [ $port -lt 65535 ] ; do
  check=$( ss -tuna | awk '{print $4}' | grep ":$port *" )
  if [ "$check" == "" ] ; then
    pfound="1"
    break
  fi
  : $((++port))
done
if [ $pfound -eq 0 ] ; then
  echo "No available communication port found to establish the SSH tunnel."
  echo "Try again later. Exiting."
  exit
else
  echo "Port to use is port=${port}"
fi

# Load Singularity
module load singularity/4.1.0-nompi

# Load ROCm and set environment variable for Singularity
module load rocm/5.7.3
export SINGULARITYENV_ROCM_PATH=$ROCM_PATH

# Pull the image into the working directory
singularity pull $imagename $image

echo "*****************************************************"
echo "Setup - from your laptop do:"
echo "ssh -N -f -L ${port}:${host}:${port} $USER@$PAWSEY_CLUSTER.pawsey.org.au"
echo "*****"
echo "The launch directory is: $jupyterDir"
echo "*****************************************************"
echo ""
echo "*****************************************************"
echo "Terminate - from your laptop do:"
echo "kill \$( ps x | grep 'ssh.*-L *${port}:${host}:${port}' | awk '{print \$1}' )"
echo "*****************************************************"
echo ""
 
# Launch our container
# and mount our working directory to /home/jovyan in the container
# and bind the run time directory to our home directory
singularity exec --rocm -C \
  -B ${jupyterDir}:/home/jovyan \
  -B ${jupyterDir}:$HOME \
  ${imagename} \
  jupyter notebook \
  --no-browser \
  --port=${port} --ip=0.0.0.0 \
  --notebook-dir=${jupyterDir}


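Once the job starts, the Slurm output file contains the exact ssh command for setting up the tunnel (and the command for tearing it down), along with the URL and token that the Jupyter server prints on startup. As a hypothetical example, with placeholder node name, port and username:

Code Block
languagebash
themeEmacs
# Run on your laptop, substituting the values printed in the Slurm output file
ssh -N -f -L 8888:nid001234:8888 username@setonix.pawsey.org.au
# Then open http://localhost:8888 in a browser and enter the token
# printed by the Jupyter server in the Slurm output file
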
...

Column
width900px


Code Block
languagepy
themeEmacs
titleListing 3. Simple GPU-enabled Python code snippet
collapsetrue
# key GPU library
from numba import cuda
import numpy as np

# define some kernels
@cuda.jit
def add_kernel(x, y, out):
    idx = cuda.grid(1)
    out[idx] = x[idx] + y[idx]

n = 4096
x = np.arange(n).astype(np.int32) # [0...4095] on the host
y = np.ones_like(x)               # [1...1] on the host
out = np.zeros_like(x)

# cuda commands to copy memory to the device 
d_x = cuda.to_device(x)
d_y = cuda.to_device(y)
d_out = cuda.to_device(out)

# run kernel
threads_per_block = 128
blocks_per_grid = 32
add_kernel[blocks_per_grid, threads_per_block](d_x, d_y, d_out)
cuda.synchronize()

# output result 
print(d_out.copy_to_host()) # Should be [1...4096]


External links

  • DockerHub
  • For information about runtime options supported by the startup script in the Jupyter image, see Common Features in the Jupyter Docker Stacks documentation
  • The Rocker Project ("Docker Containers for the R Environment")