HDF5 Examples and Tips

This page contains examples of compiling codes that use HDF5 and some tips for improving performance.

Versions installed on Pawsey systems

To check the currently installed versions, use the module avail command (installed versions may differ from those shown here):

Terminal 1. Checking for installed versions
$ module avail hdf5
------------------------ /opt/cray/pe/lmod/modulefiles/mpi/gnu/8.0/ofi/1.0/cray-mpich/8.0 -------------------------
   cray-hdf5-parallel/1.12.2.3    cray-hdf5-parallel/1.12.2.7 (D)

--------------------------- /software/setonix/2024.05/modules/zen3/gcc/12.2.0/libraries ---------------------------
   adios2/2.8.3-hdf5       hdf5/1.10.7-parallel-api-v18     hdf5/1.14.3-api-v112
   adios2/2.9.2-hdf5       hdf5/1.10.7-parallel-api-v110    hdf5/1.14.3-parallel-api-v18
   hdf5/1.10.7-api-v18     hdf5/1.14.3-api-v18              hdf5/1.14.3-parallel-api-v110
   hdf5/1.10.7-api-v110    hdf5/1.14.3-api-v110             hdf5/1.14.3-parallel-api-v112 (D)

--------------------------------- /opt/cray/pe/lmod/modulefiles/compiler/gnu/8.0 ----------------------------------
   cray-hdf5/1.12.2.3    cray-hdf5/1.12.2.7 (D)

On Setonix, the HDF5 I/O library is available through Pawsey-supported modules named hdf5/<version>-<build-options>-<api>, or through two sets of Cray-provided modules: cray-hdf5 for serial I/O and cray-hdf5-parallel for parallel I/O.

We recommend using the Pawsey-supported HDF5 versions.

Compiling HDF5-enabled code

HDF5 libraries support multiple API versions. The API version used by code that calls HDF5 determines how that code must be compiled; this holds for both the serial and parallel HDF5 modules available on Setonix. The following examples use the GCC compilers and show the required module being loaded first.

  • If the code uses the latest supported API, no additional compile flags are required. If the code uses an older API, load the Pawsey-provided module that exposes the desired API. A minimal C program that these commands would build is sketched below the terminal example:

    Terminal 2. Using the latest API to compile HDF5-enabled code
    # load the HDF5 module with the desired API (choose a -parallel variant if the parallel interface is required)
    $ module load hdf5/1.14.3-api-v112 # load latest version with latest API
    
    # C code
    $ cc my_hdf5_code.c -lhdf5
    
    # Fortran code
    $ ftn my_hdf5_code.f90 -lhdf5_fortran

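The following is a minimal sketch of an HDF5 C program that the commands above would build (the file and dataset names are hypothetical); it creates a file, writes a 2D array of doubles as a dataset, and closes all handles:

#include "hdf5.h"

int main(void)
{
    hsize_t dims[2] = {100, 200};
    static double data[100][200];

    /* Fill the array with values to write. */
    for (hsize_t i = 0; i < dims[0]; i++)
        for (hsize_t j = 0; j < dims[1]; j++)
            data[i][j] = (double)(i * dims[1] + j);

    /* Create the file, a 2D dataspace, and a dataset of doubles. */
    hid_t file  = H5Fcreate("my_data.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(2, dims, NULL);
    hid_t dset  = H5Dcreate2(file, "/mydataset", H5T_NATIVE_DOUBLE, space,
                             H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* Write the whole array in one contiguous access. */
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);

    H5Dclose(dset);
    H5Sclose(space);
    H5Fclose(file);
    return 0;
}
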
Performance tips

Use chunking for non-contiguous reads and writes

In the C and Python APIs of HDF5, data is stored and accessed in row-major order. If you read and write row-wise, your accesses will be contiguous and no changes are necessary. The same holds if you read or write entire datasets at a time; the HDF5 library is smart enough to use optimal contiguous accesses in these cases.

However, if you access a dataset column-wise, the accesses will not be contiguous and will incur a performance penalty. In this case it is better to introduce chunking into the datasets of interest, which increases the likelihood of contiguous accesses.

When setting chunk parameters, it is good practice to account for Lustre striping. For instance, Pawsey's HPC filesystems use a Lustre stripe size of 1 MB, so it is optimal to set the chunk dimensions so that each chunk holds 1 MB of data, or a factor of that amount.
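
As a sketch, chunking is enabled through a dataset creation property list with H5Pset_chunk. Here the chunk dimensions are chosen so that each chunk of doubles holds exactly 1 MiB (512 x 256 x 8 bytes), in line with the stripe advice above; the file and dataset names are hypothetical:

#include "hdf5.h"

int main(void)
{
    hsize_t dims[2]       = {8192, 8192};
    hsize_t chunk_dims[2] = {512, 256};   /* 512 * 256 * 8 B = 1 MiB per chunk */

    hid_t file = H5Fcreate("chunked.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

    /* Request a chunked layout via a dataset creation property list. */
    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 2, chunk_dims);

    hid_t space = H5Screate_simple(2, dims, NULL);
    hid_t dset  = H5Dcreate2(file, "/chunked", H5T_NATIVE_DOUBLE, space,
                             H5P_DEFAULT, dcpl, H5P_DEFAULT);

    /* Column-wise selections now touch whole chunks rather than long
     * strided runs across the full row extent. */
    H5Dclose(dset);
    H5Sclose(space);
    H5Pclose(dcpl);
    H5Fclose(file);
    return 0;
}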

If reading/writing chunked datasets, check the size of the chunk cache

The chunk cache is a memory buffer that HDF5 uses to store copies of chunks that have been read, maximising reuse of chunks held in RAM instead of re-reading them from disk. When writing, chunks are first written to the cache and then flushed to disk when either the cache is full or a read of the specified chunk is requested.

The sizing of this cache can have a dramatic effect on performance. If the cache is too small, the buffering will not improve performance; if it is too large, the flushing process can stall reads and writes.
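
A sketch of enlarging the chunk cache for a single dataset with H5Pset_chunk_cache (the sizes here are illustrative, not a recommendation): the cache is set to 16 MiB so that sixteen 1 MiB chunks can stay resident, and the number of hash slots is a prime roughly 100 times the number of cached chunks, as suggested by the HDF5 documentation:

#include "hdf5.h"

int main(void)
{
    hid_t file = H5Fopen("chunked.h5", H5F_ACC_RDONLY, H5P_DEFAULT);

    /* The default chunk cache is 1 MiB with 521 hash slots; here we
     * request a larger cache through a dataset access property list. */
    hid_t dapl = H5Pcreate(H5P_DATASET_ACCESS);
    H5Pset_chunk_cache(dapl,
                       1601,               /* hash slots: a prime ~100x the cached chunks */
                       16 * 1024 * 1024,   /* total cache size in bytes */
                       1.0);               /* evict fully read/written chunks first */

    hid_t dset = H5Dopen2(file, "/chunked", dapl);

    /* ... reads of /chunked now go through the 16 MiB cache ... */

    H5Dclose(dset);
    H5Pclose(dapl);
    H5Fclose(file);
    return 0;
}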

Don't be afraid to use dataset compression

Compression is a good way to reduce storage requirements. However, reading and writing compressed data is significantly more expensive than for uncompressed data. Choosing optimal chunk dimensions and an appropriate chunk cache size can minimise this performance overhead.
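
As a sketch, gzip (deflate) compression is enabled with H5Pset_deflate on the same dataset creation property list used for chunking; compression in HDF5 operates per chunk, so a chunked layout is required (names are hypothetical):

#include "hdf5.h"

int main(void)
{
    hsize_t dims[2]       = {8192, 8192};
    hsize_t chunk_dims[2] = {512, 256};   /* 1 MiB of doubles per chunk */

    hid_t file = H5Fcreate("compressed.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 2, chunk_dims);    /* deflate requires a chunked layout */
    H5Pset_deflate(dcpl, 6);              /* gzip level 6: speed/size trade-off */

    hid_t space = H5Screate_simple(2, dims, NULL);
    hid_t dset  = H5Dcreate2(file, "/compressed", H5T_NATIVE_DOUBLE, space,
                             H5P_DEFAULT, dcpl, H5P_DEFAULT);

    H5Dclose(dset);
    H5Sclose(space);
    H5Pclose(dcpl);
    H5Fclose(file);
    return 0;
}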

External links

For more information and examples on improving performance with the HDF5 libraries and APIs, see the following external documentation: