Excerpt |
---|
This page describes how to run JupyterLab in a container on Pawsey systems with Slurm. This involves launching JupyterLab and then connecting to the Jupyter server. |
...
For this example, we're going to be using the jupyter/datascience-notebook (external site) Docker image. It provides a Conda environment with a large collection of common Python packages (including NumPy, SciPy, Pandas, Scikit-learn, Bokeh and Matplotlib), an R environment (with the tidyverse (external site) packages), and a Julia environment. All of these are accessible via a Jupyter notebook server.
This Docker image ships with a startup script that allows for a number of runtime options to be specified. Most of these are specific to running a container using Docker; we will focus on how to run this container using Singularity.
The datascience-notebook image has a default user, jovyan, and it assumes that you will be able to write to /home/jovyan. When you run a Docker container via Singularity, you run as your Pawsey username inside the container, so you won't be able to write to /home/jovyan. Instead, we can mount a specific directory (on Pawsey's filesystems) into the container at /home/jovyan. This allows the Jupyter server to save notebooks and write checkpoint files, which persist on Pawsey's filesystem after the container has stopped.
...
Column |
---|
|
Code Block |
---|
language | bash |
---|
theme | Emacs |
---|
title | Listing 1. Slurm script for running JupyterHub in a GPU-enabled container |
---|
collapse | true |
---|
| #!/bin/bash -l
# Allocate slurm resources, edit as necessary
#SBATCH --account=[your-project-name]
# Here we request the appropriate partition for the system
#SBATCH --partition=work
# Since jupyterlab is not mpi enabled, we just use one task
#SBATCH --ntasks=1
#SBATCH --mem=20GB
#SBATCH --time=02:00:00
#SBATCH --job-name=jupyter_notebook
#SBATCH --export=NONE
# Set our working directory
# This is the directory we'll mount to /home/jovyan in the container
# Should be in a writable path with some space, like /scratch
jupyterDir="${MYSCRATCH}/jupyter-dir"
# Set the image and tag we want to use
image="docker://jupyter/datascience-notebook:latest"
# You should not need to edit the lines below
# Prepare the working directory
mkdir -p ${jupyterDir}
cd ${jupyterDir}
# Get the image filename
imagename=${image##*/}
imagename=${imagename/:/_}.sif
# Get the hostname
# We'll set up an SSH tunnel to connect to the Jupyter notebook server
host=$(hostname)
# Set the port for the SSH tunnel
# This part of the script uses a loop to search for available ports on the node;
# this will allow multiple instances of GUI servers to be run from the same host node
port="8888"
pfound="0"
while [ $port -lt 65535 ] ; do
# With both -t and -u, ss prints a leading Netid column, so the local address:port is field 5
check=$( ss -tuna | awk '{print $5}' | grep ":$port *" )
if [ "$check" == "" ] ; then
pfound="1"
break
fi
: $((++port))
done
if [ $pfound -eq 0 ] ; then
echo "No available communication port found to establish the SSH tunnel."
echo "Try again later. Exiting."
exit 1
fi
# Load Singularity
module load singularity/3.11.4-nompi
# Pull our image in a folder
singularity pull $imagename $image
echo "*****************************************************"
echo "Setup - from your laptop do:"
echo "ssh -N -f -L ${port}:${host}:${port} $USER@$PAWSEY_CLUSTER.pawsey.org.au"
echo "*****"
echo "The launch directory is: $jupyterDir"
echo "*****************************************************"
echo ""
echo "*****************************************************"
echo "Terminate - from your laptop do:"
echo "kill \$( ps x | grep 'ssh.*-L *${port}:${host}:${port}' | awk '{print \$1}' )"
echo "*****************************************************"
echo ""
# Launch our container
# and mount our working directory to /home/jovyan in the container
# and bind the run time directory to our home directory
singularity exec -C \
-B ${jupyterDir}:/home/jovyan \
-B ${jupyterDir}:${HOME} \
${imagename} \
jupyter notebook \
--no-browser \
--port=${port} --ip=0.0.0.0 \
--notebook-dir=${jupyterDir} |
|
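Two pieces of logic in Listing 1 are easy to misread: the derivation of the image filename and the search for a free port. The sketch below mirrors both in Python for illustration only. It is not part of the Slurm script, and the function names `sif_name` and `first_free_port` are hypothetical. The port search here takes a set of ports already in use as input, rather than querying the node with `ss` as the script does.

```python
def sif_name(image):
    """Mirror of the two bash expansions in Listing 1.

    ${image##*/}        -> keep everything after the last "/"
    ${imagename/:/_}    -> replace the first ":" with "_"
    """
    name = image.rsplit("/", 1)[-1]
    return name.replace(":", "_", 1) + ".sif"


def first_free_port(used_ports, start=8888, stop=65535):
    """Return the first port in [start, stop) that is not in used_ports.

    Returns None when every candidate is taken, which corresponds to the
    "No available communication port found" branch of the script.
    """
    used = set(used_ports)
    for port in range(start, stop):
        if port not in used:
            return port
    return None


print(sif_name("docker://jupyter/datascience-notebook:latest"))
# datascience-notebook_latest.sif

print(first_free_port({8888, 8889}))
# 8890
```

Because the search starts at 8888 and moves upward, two Jupyter jobs that land on the same node end up on different ports, and each job prints its own SSH tunnel command with the port it found.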
...
Column |
---|
|
Code Block |
---|
language | py |
---|
theme | Emacs |
---|
title | Listing 3. Simple GPU-enabled Python code snippet |
---|
collapse | true |
---|
| # key GPU library
from numba import cuda
import numpy as np
# define some kernels
@cuda.jit
def add_kernel(x, y, out):
    idx = cuda.grid(1)
    out[idx] = x[idx] + y[idx]
n = 4096
x = np.arange(n).astype(np.int32) # [0...4095] on the host
y = np.ones_like(x) # [1...1] on the host
out = np.zeros_like(x)
# cuda commands to copy memory to the device
d_x = cuda.to_device(x)
d_y = cuda.to_device(y)
d_out = cuda.to_device(out)
# run kernel
threads_per_block = 128
blocks_per_grid = 32
add_kernel[blocks_per_grid, threads_per_block](d_x, d_y, d_out)
cuda.synchronize()
# output result
print(d_out.copy_to_host()) # Should be [1...4096] |
|
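The launch configuration in Listing 3 works because 32 blocks of 128 threads cover exactly 4096 elements. As a minimal sketch of that indexing, the pure-Python code below emulates on the CPU what each GPU thread computes, using the fact that `cuda.grid(1)` returns the block index times the block size plus the thread index. The helper names `blocks_needed` and `emulate_add_kernel` are hypothetical and do not come from Numba. The `idx < n` guard is an addition for array sizes that are not a multiple of the block size; the kernel in Listing 3 does not need it for n = 4096.

```python
def blocks_needed(n, threads_per_block):
    """Smallest number of blocks whose threads cover n elements (ceiling division)."""
    return (n + threads_per_block - 1) // threads_per_block


def emulate_add_kernel(x, y, threads_per_block):
    """CPU emulation of a 1D grid launch of an element-wise add kernel."""
    n = len(x)
    out = [0] * n
    for block in range(blocks_needed(n, threads_per_block)):
        for thread in range(threads_per_block):
            # Same value cuda.grid(1) would give this thread on the GPU
            idx = block * threads_per_block + thread
            if idx < n:  # guard against threads past the end of the array
                out[idx] = x[idx] + y[idx]
    return out


n = 4096
x = list(range(n))   # [0...4095]
y = [1] * n          # [1...1]
out = emulate_add_kernel(x, y, threads_per_block=128)

print(blocks_needed(n, 128))  # 32
print(out[0], out[-1])        # 1 4096
```

This can serve as a reference result when checking the output of the GPU version, since it needs neither a GPU nor Numba.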
External links
- DockerHub
- For information about runtime options supported by the startup script in the Jupyter image, see Common Features in the Jupyter Docker Stacks documentation
- The Rocker Project ("Docker Containers for the R Environment")