GROMACS

GROMACS is a versatile package for performing molecular dynamics, that is, simulating the Newtonian equations of motion for systems with hundreds to millions of particles.

GROMACS is primarily designed for biochemical molecules like proteins, lipids, and nucleic acids that have a lot of complicated bonded interactions. Because GROMACS is extremely fast at calculating the nonbonded interactions that usually dominate simulations, many groups are also using it for research on non-biological systems, for example, polymers.

Versions installed in Pawsey systems

To check the current installed versions, use the module avail command (current versions may be different from content shown here):

Terminal 1. Checking for installed versions
$ module avail gromacs
------------------------- /software/setonix/2024.05/modules/zen3/gcc/12.2.0/applications --------------------------
   gromacs-amd-gfx90a/2023    gromacs/2022.5-mixed    gromacs/2023-mixed (D)
   gromacs/2022.5-double      gromacs/2023-double

Modules with the -amd-gfx90a suffix support GPU offloading and are meant to be used within the gpu partition. The -mixed suffix denotes a mixed-precision build and -double a double-precision build.

GROMACS is compiled with the GNU programming environment.

All GROMACS installations on Setonix have been patched with Plumed.
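For example, loading the double-precision module provides the gmx_mpi_d executable used in Listing 1 below (a quick sketch; the mixed-precision CPU and GPU modules provide gmx_mpi instead, following the usual GROMACS naming convention):

$ module load gromacs/2023-double
$ gmx_mpi_d --version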

Example: Running GROMACS on CPU

Listing 1 shows an example GROMACS job script. It uses the benchMEM.tpr benchmark case, which can be found at the following page: A free GROMACS benchmark set (external site).

Listing 1. Example of a job script running GROMACS on the benchMEM.tpr benchmark
#!/bin/bash --login
#SBATCH --nodes=1
#SBATCH --ntasks=128
#SBATCH --exclusive
#SBATCH --time=00:05:00
#SBATCH --account=[your-project]

module load gromacs/2023-double

export OMP_NUM_THREADS=1

srun -N 1 -n 128 gmx_mpi_d mdrun -s benchMEM.tpr

For more information on how to run jobs on the CPU partitions see: Example Slurm Batch Scripts for Setonix on CPU Compute Nodes.
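
If pure-MPI scaling is poor for a given system, a hybrid MPI/OpenMP layout can be worth testing. The script below is a minimal sketch only; the 32-rank by 4-thread split is an illustrative assumption, not a tuned configuration, and the optimal balance should be benchmarked for each system:

#!/bin/bash --login
#SBATCH --nodes=1
#SBATCH --ntasks=32
#SBATCH --cpus-per-task=4
#SBATCH --exclusive
#SBATCH --time=00:05:00
#SBATCH --account=[your-project]

module load gromacs/2023-double

# 4 OpenMP threads per MPI rank (illustrative split; benchmark for your own case)
export OMP_NUM_THREADS=4

srun -N 1 -n 32 -c 4 gmx_mpi_d mdrun -ntomp 4 -s benchMEM.tpr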

Running GROMACS on GPUs

GROMACS supports offloading some operations to GPUs. Acceleration is officially supported using the SYCL standard. Additionally, AMD maintains its own GROMACS GPU implementation using HIP. The AMD HIP port is often faster than the SYCL version, but it is not officially supported by the GROMACS core developers and lags behind the official releases in features. The version currently installed on Setonix, gromacs-amd-gfx90a/2023, is the AMD port.

GPU offloading can be enabled with the following options:

  • -pme gpu: compute long-range (PME) interactions on the GPU.
  • -npme 1: dedicate a single MPI rank to the PME calculation, which can improve performance when running with two GCDs. Note that the module on Setonix does not support more than one PME rank when offloading PME to GPUs; this can degrade the performance of multi-GPU calculations with more than two GCDs due to load imbalance between the PP and PME components of the calculation.
  • -nb gpu: compute non-bonded interactions on the GPU.
  • -bonded gpu: compute bonded interactions on the GPU. This option is not always available; GROMACS will print an error message when the operation cannot be performed on the GPU.
  • -update gpu: compute constraints and the coordinate update on the GPU. This option is not always available; GROMACS will print an error message when the operation cannot be performed on the GPU.

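For instance, a single-GCD run that offloads all four of the components listed above could use an mdrun invocation like the sketch below (drop -bonded gpu or -update gpu if GROMACS reports that they cannot be offloaded for your input):

srun gmx_mpi mdrun -nb gpu -pme gpu -bonded gpu -update gpu -ntomp 8 -s benchMEM.tpr
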
More information can be found in the following page in the GROMACS documentation: Running mdrun with GPUs (external site).

As in the CPU example, we use the benchMEM.tpr benchmark case from the page A free GROMACS benchmark set (external site). Here is a simple submission script.

Listing 2. A sample batch script to submit a GROMACS job to GPU.
#!/bin/bash

#SBATCH --nodes=1
#SBATCH --gres=gpu:1
#SBATCH --partition=gpu
#SBATCH --account=[your-project]-gpu

module load craype-accel-amd-gfx90a
module load gromacs-amd-gfx90a/2023

srun gmx_mpi mdrun -nb gpu -bonded gpu -ntomp 8 -s benchMEM.tpr

For more information on how to run jobs on the GPU partitions see Example Slurm Batch Scripts for Setonix on GPU Compute Nodes.

Multi-GPU calculations

Multi-GPU calculations are supported by GROMACS on Setonix. Running with one MPI rank and multiple OpenMP threads per GCD typically gives good performance. GROMACS can take advantage of the srun options --gpus-per-task=1 and --gpu-bind=closest, which ensure optimal binding of each GCD to its directly connected chiplet on the CPU, as described in Example Slurm Batch Scripts for Setonix on GPU Compute Nodes.


There are some general limitations to the multi-GPU performance:

  • GROMACS's hardware report (printed in the log file) has a known limitation: it only reports a single GCD as being "visible" to the main MPI rank. The log file will therefore state "1 GPU selected for this run" even though the calculation actually uses all the GCDs requested in the job script.
  • The gromacs-amd-gfx90a/2023 module on Setonix is limited to one dedicated PME rank when offloading PME to the GPU. This means that the time taken by the PME component of the simulation will stay constant as more GPUs are added, degrading performance when running with more than 2 GCDs. GROMACS will print a warning to the log file if the PP–PME workload imbalance becomes significant.
  • GROMACS's dynamic load-balancing has limited support for fully-GPU-resident calculations. GROMACS will print a warning to the log file if there is a substantial workload imbalance between MPI ranks, which can be a good starting point when configuring future calculations.

While it is possible to use multiple GPUs in parallel, the above limitations mean there may be limited benefit to running with more than two GCDs.
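
To locate the warnings mentioned above after a run completes, search the mdrun log file (md.log by default); a quick sketch:

$ grep -i "selected for this run" md.log
$ grep -i "imbalance" md.log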


Here is a sample multi-GPU submission script for the benchMEM.tpr benchmark, using two GCDs and 16 CPU cores. One MPI rank is dedicated to the PME component and the other calculates the particle-particle (PP) and non-bonded (NB) interactions and updates the particle positions:

Listing 3. A sample batch script to submit a GROMACS job using multiple GPUs.
#!/bin/bash

#SBATCH --nodes=1
#SBATCH --gres=gpu:2
#SBATCH --partition=gpu
#SBATCH --account=[your-project]-gpu

module load craype-accel-amd-gfx90a
module load gromacs-amd-gfx90a/2023

export MPICH_GPU_SUPPORT_ENABLED=1 # This allows for GPU-aware MPI communication among GPUs

srun -N 1 -n 2 -c 8 --gpus-per-task=1 --gpu-bind=closest gmx_mpi mdrun -nb gpu -bonded gpu -pme gpu -npme 1 -ntomp 8 -s benchMEM.tpr

For more information on how to run jobs on the GPU partitions see Example Slurm Batch Scripts for Setonix on GPU Compute Nodes.
