High Performance Libraries
High performance libraries are third-party numerical and I/O libraries that collect optimised implementations of common algorithms. Their use is strongly recommended on Pawsey supercomputers: it saves developers from rewriting popular algorithms and provides implementations that are likely to be more efficient.
Many commonly used libraries are available on Pawsey systems in their standard form as modules.
Numerical Libraries
Cray-Libsci
Cray provides the cray-libsci module, which contains a set of numerical libraries: BLAS, LAPACK and SCALAPACK.
The current version of this library has a bug in BLAS, so we do not recommend using this module. Because it is loaded by default whenever a programming environment is loaded or swapped to, we advise users to unload it and load an equivalent Pawsey-supported library (OpenBLAS or Netlib):
$ module unload cray-libsci
$ # load either openblas
$ module load openblas/<version>
$ # or netlib
$ module load netlib-lapack/<version> netlib-scalapack/<version>
OpenBLAS
OpenBLAS provides a set of numerical libraries: BLAS, LAPACK and SCALAPACK.
- BLAS: Basic Linear Algebra Subprograms is a specification that prescribes a set of low-level routines for performing common linear algebra operations such as vector addition, scalar multiplication, dot products, linear combinations and matrix multiplication. Routines have bindings for both C ("CBLAS interface") and Fortran ("BLAS interface").
- LAPACK: Linear Algebra Package provides routines for solving systems of linear equations and linear least squares, eigenvalue problems, and singular value decomposition. It also includes routines to implement the associated matrix factorizations such as LU, QR, Cholesky and Schur decomposition. LAPACK is written in Fortran and handles both real and complex matrices in both single and double precision.
- SCALAPACK: Scalable LAPACK library includes a subset of LAPACK routines redesigned for distributed memory MIMD parallel computers. It is currently written in an SPMD (single program, multiple data) style using explicit message passing for interprocessor communication. It assumes matrices are laid out in a two-dimensional block-cyclic decomposition.
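The two-dimensional block-cyclic decomposition mentioned for SCALAPACK can be illustrated with a short pure-Python sketch. It only computes which process in a process grid owns each matrix entry and does not call SCALAPACK itself. The function names `owner` and `layout` and the parameters `mb`, `nb`, `prow` and `pcol` are hypothetical names chosen for illustration, and the sketch assumes zero-based indices with the first block placed on process (0, 0).

```python
def owner(i, j, mb, nb, prow, pcol):
    """Return the (process row, process column) owning global entry (i, j).

    mb x nb is the block size and prow x pcol is the process grid shape.
    Assumption: zero-based indices, first block stored on process (0, 0).
    """
    return (i // mb) % prow, (j // nb) % pcol


def layout(m, n, mb, nb, prow, pcol):
    """Build an m x n table with the owning process of every matrix entry."""
    return [[owner(i, j, mb, nb, prow, pcol) for j in range(n)] for i in range(m)]


if __name__ == "__main__":
    # 8 x 8 matrix, 2 x 2 blocks, 2 x 2 process grid
    for row in layout(8, 8, 2, 2, 2, 2):
        print(" ".join(f"{p}{q}" for p, q in row))
```

Printing the table shows that blocks are dealt out to the process grid in a round-robin fashion along both dimensions, which is what spreads the work of a factorization evenly as it proceeds through the matrix.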
Netlib
Netlib provides the LAPACK and SCALAPACK numerical libraries.
On Setonix, these are available as the netlib-lapack/<version> and netlib-scalapack/<version> modules.
FFTW
FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, that is, the discrete cosine transforms (DCTs) and discrete sine transforms (DSTs)).
Both FFTW 3.x and FFTW 2.x are available as modules on Setonix and Garrawarla. On Setonix there is also a Cray-provided FFTW 3.x module (cray-fftw) and an AMD Zen3 optimised version provided by the amdfftw/<version> module.
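For reference, the one-dimensional forward DFT of a sequence x of length N is X[k] = sum over n of x[n] * exp(-2*pi*i*k*n/N), without normalisation. The sketch below evaluates this sum directly in pure Python. It is a definition-level illustration only and does not call FFTW: the function name `naive_dft` is hypothetical, and the direct sum costs O(N^2) operations, whereas FFT libraries use fast algorithms that scale as O(N log N).

```python
import cmath


def naive_dft(x):
    """Evaluate the unnormalised forward DFT of x directly from its definition."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


if __name__ == "__main__":
    # One period of a cosine sampled at four points
    signal = [1.0, 0.0, -1.0, 0.0]
    for k, value in enumerate(naive_dft(signal)):
        print(k, round(value.real, 6), round(value.imag, 6))
```

A direct implementation like this is mainly useful for checking the output of an optimised library on small inputs.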
AMD Optimised Numerical Libraries
On Setonix, there are AMD Zen3 optimised libraries for BLAS, LAPACK, SCALAPACK, FFTW, and even the basic Math library libm.
- BLIS: provides BLAS-like dense linear algebra libraries. Available as the amdblis/<version> module.
- libFLAME: library for dense matrix computations, providing much of the functionality present in LAPACK. It includes a compatibility layer, FLAPACK, which includes a complete LAPACK implementation. Available as the amdlibflame/<version> module.
- SCALAPACK: provides the SCALAPACK library. Available as the amdscalapack/<version> module.
- FFTW: provides FFTW. Available as the amdfftw/<version> module.
- libm: library containing a collection of basic math functions optimized for x86-64 processor-based machines. It provides many routines from the list of standard C99 math functions. Available as the amdlibm/<version> module.
- AOCL-Sparse: library for basic linear algebra on sparse matrices and vectors, optimized for Zen3. It is designed to be used with C and C++. The library currently supports sparse matrix-vector multiplication (SPMV) with the CSR and ELLPACK formats. Available as the aocl-sparse/<version> module.
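To make the SPMV operation and the CSR format concrete, here is a minimal pure-Python sketch of a sparse matrix-vector product. It does not use AOCL-Sparse or reproduce its API. The function name `csr_spmv` and the array names `values`, `col_idx` and `row_ptr` are hypothetical, chosen to match the usual description of CSR: the non-zero values stored row by row, their column indices, and the offset at which each row starts.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    values  : non-zero entries of A, stored row by row
    col_idx : column index of each entry in values
    row_ptr : row r occupies values[row_ptr[r]:row_ptr[r + 1]]
    """
    y = []
    for row in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


if __name__ == "__main__":
    # A = [[4, 0, 1],
    #      [0, 3, 0],
    #      [2, 0, 5]]
    values = [4.0, 1.0, 3.0, 2.0, 5.0]
    col_idx = [0, 2, 1, 0, 2]
    row_ptr = [0, 2, 3, 5]
    print(csr_spmv(values, col_idx, row_ptr, [1.0, 2.0, 3.0]))
```

Only the stored non-zeros are visited, so the cost is proportional to the number of non-zero entries rather than to the full matrix size.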
Eigen
Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.
On Setonix, Eigen is available as a module.
PETSc
PETSc (pronounced PET-see) is a suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations. It supports MPI, and GPUs through CUDA or OpenCL, as well as hybrid MPI-GPU parallelism; a HIP version is currently under development. PETSc (sometimes called PETSc/Tao) also contains the Tao optimization software library.
On Setonix, PETSc is available as a module.
Trilinos
The Trilinos project is an effort to develop algorithms and enabling technologies within an object-oriented software framework for the solution of large-scale, complex multi-physics engineering and scientific problems. A unique design feature of Trilinos is its focus on packages.
Each Trilinos package is a self-contained, independent piece of software with its own set of requirements, its own development team and group of users. Because of this, Trilinos itself is designed to respect the autonomy of packages. Trilinos offers a variety of ways for a particular package to interact with other Trilinos packages. It also offers a set of tools that can assist package developers with builds across multiple platforms, generating documentation and regression testing across a set of target platforms. At the same time, what a package must do to be called a Trilinos package is minimal, and varies with each package. The list of all packages can be found on the external list of Trilinos packages.
On Setonix, Trilinos is available as a module.
SLATE
The SLATE (Software for Linear Algebra Targeting Exascale) project provides basic dense matrix operations (e.g., matrix multiplication, rank-k update, triangular solve), linear systems solvers, least squares solvers, and singular value and eigenvalue solvers. Its goal is to replace SCALAPACK.
On Setonix, SLATE is available as a module.
PLASMA
The PLASMA (Parallel Linear Algebra for Scalable Multi-core Architectures) project provides libraries to solve dense general systems of linear equations, symmetric positive definite systems of linear equations and linear least squares problems, using LU, Cholesky, QR and LQ factorizations. Real arithmetic and complex arithmetic are supported in both single precision and double precision. The goal is to address shortcomings in the standard LAPACK and SCALAPACK libraries.
On Setonix, PLASMA is available as a module.
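As an illustration of one of the problem classes listed above, the following pure-Python sketch solves a symmetric positive definite system with a Cholesky factorization followed by two triangular solves. It is a textbook, unblocked version written for clarity. It is not PLASMA's implementation or API, and the function names `cholesky` and `solve_spd` are hypothetical.

```python
import math


def cholesky(a):
    """Return the lower-triangular L with A = L * L^T for an SPD matrix A."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (a[i][j] - s) / l[j][j]
    return l


def solve_spd(a, b):
    """Solve A x = b for SPD A using the Cholesky factor of A."""
    l = cholesky(a)
    n = len(b)
    # Forward substitution: L y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    # Backward substitution: L^T x = y
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x


if __name__ == "__main__":
    a = [[4.0, 2.0], [2.0, 3.0]]
    print(solve_spd(a, [6.0, 5.0]))
```

In production code the same steps would be delegated to an optimised library, which reorganises the loops to make better use of caches and multiple cores.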
I/O Libraries
HDF5
HDF5 (Hierarchical Data Format 5) is a library package that defines specific layouts, access patterns and management processes for output files. HDF5 can effectively store and access datasets of tens to hundreds of gigabytes. It has comprehensive tools and APIs that allow for very efficient serial and parallel access to data. It is a widely adopted data format that is supported in many tools and applications.
On Setonix, a wide variety of builds is available, covering serial and parallel builds for multiple API versions. The module names are hdf5/<version>-api-<api_version> for serial builds with C++ support and hdf5/<version>-parallel-<api_version> for parallel builds. There are also two Cray-provided modules on Setonix, cray-hdf5 for serial I/O and cray-hdf5-parallel for parallel I/O, which only provide version 1.12.x with the v112 API. A set of standard modules for HDF5 is also available on Garrawarla, providing serial and parallel builds, builds compatible with the GPU software stack, and multiple API versions.
h5py is the most popular package for making use of HDF5 in Python. This package provides a simple but comprehensive interface to most of the features of the HDF5 library. More information can be found on the HDF5 for Python homepage, including the HDF5 for Python user manual, which contains comprehensive documentation about the Python API used in h5py.
Restriction
h5py is available as a standard module on Setonix and Garrawarla but, at the time of writing, only serial I/O is supported.
The Learning HDF5 page is maintained by The HDF Group. It includes some extremely useful tutorials, examples and videos.
See HDF5 Examples and Tips for examples on compiling codes that use HDF5 and some performance tips.
NetCDF
NetCDF (Network Common Data Form) is a set of interfaces for array-oriented data access and a freely distributed collection of data access libraries for C, Fortran, C++, Java, and other languages. The NetCDF libraries support a machine-independent format for representing scientific data. Together, the interfaces, libraries, and format support the creation, access, and sharing of scientific data.
On Setonix, NetCDF is available in a variety of flavours: standard C interface (netcdf-c/<version>); C++ interface to NetCDF-4 files (netcdf-cxx4/<version>); legacy C++ interface (netcdf-cxx/<version>); Fortran interface (netcdf-fortran/<version>); and parallel NetCDF (parallel-netcdf/<version>). We recommend using these modules. Also available are two Cray-provided modules: cray-netcdf for serial I/O, and cray-netcdf-hdf5parallel for parallel I/O.
ADIOS2
ADIOS2 (Adaptable Input Output System) is an open-source framework that provides scalable parallel I/O. ADIOS2 bindings are available for C++, C, Fortran and Python.
On Setonix, ADIOS2 is provided as a standard module, built with HDF5 support.
Parallel Libraries
Boost
Boost is a set of libraries for the C++ programming language that provide support for tasks and structures such as linear algebra, pseudorandom number generation, multithreading, image processing, regular expressions and unit testing. It contains over eighty individual libraries.
Boost is available as a standard module on Setonix and Garrawarla.
HPX
HPX is a C++ Standard Library for Concurrency and Parallelism. It implements all of the corresponding facilities as defined by the C++ Standard. HPX also implements functionality proposed as part of the ongoing C++ standardization process.
On Setonix, HPX is available as a module.
Kokkos
Kokkos is a programming model in C++ for writing performance portable applications targeting all major HPC platforms. For that purpose it provides abstractions for both parallel execution of code and data management. Kokkos is designed to target complex node architectures with N-level memory hierarchies and multiple types of execution resources. It currently can use CUDA, HIP, SYCL, HPX, OpenMP and C++ threads as backend programming models with several other backends in development.
On Setonix, Kokkos is available in two flavours: one using HPX and the other using OpenMP, as kokkos/<version>-hpx and kokkos/<version>-openmp respectively.
Charm++
Charm++ is a parallel programming framework in C++ supported by an adaptive runtime system, allowing programs to run portably from small multicore jobs to the largest supercomputers. It is used by codes such as NAMD to provide parallelism.
On Setonix, Charm++ is available as a module, charmpp/<version>.
Related pages
- For information on listing and loading modules, see Modules.