Using Boltz v.1.0.0 on AMD GPUs

Boltz is a machine learning-based biomolecular structure prediction tool. It can predict structures for proteins, nucleic acids, and small molecules. This guide covers how to run the AMD-optimized version of Boltz on the Setonix supercomputer at Pawsey.

Prerequisites

  • A Pawsey account with GPU allocation

  • Input files for Boltz
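As a rough illustration of what an input file might look like, the sketch below writes a one-chain FASTA file named file.fasta, the name used by the job script template. It assumes the Boltz FASTA header convention of CHAIN_ID|ENTITY_TYPE, with the MSA path omitted because the template passes --use_msa_server; check the Boltz documentation for the exact format your version expects. The sequence is a made-up placeholder, and the fallback to the current directory when MYSCRATCH is unset is only for illustration.

```shell
#!/bin/bash
# Sketch only: create a minimal single-chain FASTA input for Boltz.
# Assumption: header format is ">CHAIN_ID|ENTITY_TYPE" when an MSA server is used.

# Same input directory as the job script template; falls back to the
# current directory if MYSCRATCH is not set (illustrative fallback).
INPUTDIR=${MYSCRATCH:-$PWD}/boltz/inputs
mkdir -p "${INPUTDIR}"

# Placeholder sequence (hypothetical, not a real target).
cat > "${INPUTDIR}/file.fasta" <<'EOF'
>A|protein
MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ
EOF

echo "Wrote ${INPUTDIR}/file.fasta"
```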

Job Script Template

This example sbatch script runs Boltz on a single GCD (Graphics Compute Die), which Slurm on Setonix allocates as one GPU.

Boltz can use more than one GCD. To adapt this template to allocate multiple GCDs, see: https://pawsey.atlassian.net/wiki/spaces/US/pages/51928618
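A minimal sketch of how the resource request might scale with the number of GCDs. It assumes 8 CPU cores per GCD, following the -c 8 used for one GCD in the template, and it assumes that boltz predict accepts a --devices option to set the number of devices; confirm both against the linked Pawsey page and boltz predict --help. NUM_GCDS and CPUS_PER_GCD are illustrative variable names, and the script only prints the lines you would change.

```shell
#!/bin/bash
# Sketch only: derive the Slurm options for a multi-GCD run on one node.

NUM_GCDS=2        # number of GCDs to request (illustrative value)
CPUS_PER_GCD=8    # assumption: 8 cores per GCD, as in the template's "-c 8"

# Total cores for the single task that drives all requested GCDs.
TOTAL_CPUS=$(( NUM_GCDS * CPUS_PER_GCD ))

# Print the lines that would replace their single-GCD counterparts.
echo "#SBATCH --gres=gpu:${NUM_GCDS}"
echo "srun -N 1 -n 1 -c ${TOTAL_CPUS} --gres=gpu:${NUM_GCDS} ..."
# Assumption: --devices tells Boltz how many devices to use.
echo "boltz predict ... --devices ${NUM_GCDS}"
```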

#!/bin/bash -l
#SBATCH --account=${PAWSEYPROJECT}-gpu
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --gres=gpu:1
#SBATCH --time=02:00:00
#SBATCH --job-name=boltz_prediction

# Load required modules
module load pawseyenv/2023.08
module load singularity/3.11.4-nompi

cd $MYSCRATCH/boltz/inputs

# Set container
containerImage=docker://quay.io/pawsey/boltz-1x:rocm6.3.4

# Set input dir
INPUTDIR=$MYSCRATCH/boltz/inputs

# Set output directory
OUTDIR=$MYSCRATCH/boltz/${SLURM_JOB_ID}
mkdir -p ${OUTDIR}

# Set cache directory
CACHEDIR=$MYSCRATCH/boltz/cache
mkdir -p ${CACHEDIR}

# Set numba cache dir
export numba_cache_dir=$MYSCRATCH/numba_cache_dir/${SLURM_JOB_ID}
mkdir -p ${numba_cache_dir}

# Run Boltz prediction with bind mounting for the directories we need
srun -N 1 -n 1 -c 8 --gres=gpu:1 \
    singularity exec \
    -B ${numba_cache_dir} \
    -B ${INPUTDIR} \
    -B ${OUTDIR} \
    -B ${CACHEDIR} \
    ${containerImage} boltz predict \
    ${INPUTDIR}/file.fasta \
    --cache ${CACHEDIR} \
    --use_msa_server \
    --out_dir ${OUTDIR}

Before Running

  1. Modify the paths to INPUTDIR, OUTDIR, CACHEDIR and numba_cache_dir as required.

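  2. Replace ${PAWSEYPROJECT} in the --account line with your project code, keeping the -gpu suffix. Slurm does not expand shell variables in #SBATCH directives.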

  3. Consider pulling the container image ahead of time on the work partition. The key command is singularity pull docker://quay.io/pawsey/boltz-1x:rocm6.3.4. Otherwise, the script pulls the container the first time it runs, which lengthens the job and consumes part of your GPU allocation.
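One way to pull the image ahead of time is sketched below. It pulls the image once into a SIF file and skips the pull if the file already exists. The SIF_DIR location and file name are hypothetical choices, and the fallback to the current directory when MYSCRATCH is unset is only for illustration. Load the singularity module from the job script template before running it.

```shell
#!/bin/bash
# Sketch only: pull the Boltz container once and reuse the local SIF file.

IMAGE_URI=docker://quay.io/pawsey/boltz-1x:rocm6.3.4

# Hypothetical location for the pulled image; adjust as required.
SIF_DIR=${MYSCRATCH:-$PWD}/boltz/containers
SIF_FILE=${SIF_DIR}/boltz-1x_rocm6.3.4.sif
mkdir -p "${SIF_DIR}"

if [ -f "${SIF_FILE}" ]; then
    echo "Image already present: ${SIF_FILE}"
elif command -v singularity >/dev/null 2>&1; then
    # singularity pull [output file] <URI>
    singularity pull "${SIF_FILE}" "${IMAGE_URI}"
else
    echo "singularity not found; load the singularity module first" >&2
fi
```

With the image pulled, setting containerImage=${SIF_FILE} in the job script would make singularity exec use the local file instead of the docker:// URI.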