How to Manually Build Software

When the installation of software is not supported by the Spack software manager, you will have to go through every step of the associated build process yourself. This page provides guidance on how to build an executable from source code when the build process is supported by Make, CMake or a configure script.

Building with Spack is usually faster, and the resulting build is likely to be optimised for the supercomputer hardware.

You should only attempt to build a software application manually if Spack does not support it, or if you have a good reason to do so, for example compiling with an uncommon option enabled. Check also the Compiling and Compiler Optimisation Levels pages for common options used for supercomputing applications.

Prerequisites

To begin with, retrieve the source code and save it on the supercomputer.

Remember that the build process must happen on the type of compute node that the code will execute on, in order to take advantage of all the optimisations for that particular architecture.

Context

The build of scientific software is the process of generating a functioning executable program from its source code. A build follows a standard sequence of steps:

  1. Environment configuration. The software you want to build and the build process itself almost always depend on libraries, tools and supporting files being present on the system. You must ensure that all required dependencies are available and discoverable through appropriate mechanisms such as environment variables.
  2. Compiling. This is the process of transforming the source code of the software into machine code, which is stored in object files. This task is performed by compilers such as gcc.
  3. Linking. Library dependencies and the object files produced by the previous steps are linked together to form an executable.
  4. Installation. Executables and other necessary artifacts, like shared or static libraries, are moved to the desired installation location on the /software filesystem, where they can be found for later use.
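The compile, link and install steps above can be sketched by hand for a one-file C program. This is a minimal illustration rather than a recommended workflow: it assumes a C compiler is reachable as cc, and the file names and temporary directories are hypothetical. Real projects delegate these commands to the tools described below.

```shell
# Minimal sketch of steps 2 to 4 for a one-file C program.
# Assumptions: a C compiler is reachable as "cc"; all paths are temporary and hypothetical.
work_dir=$(mktemp -d)
install_dir="$work_dir/install"

cat > "$work_dir/hello.c" <<'EOF'
#include <stdio.h>
int main(void) { puts("hello"); return 0; }
EOF

if command -v cc >/dev/null 2>&1; then
    # Step 2, compiling: source code -> object file.
    cc -O2 -c "$work_dir/hello.c" -o "$work_dir/hello.o"
    # Step 3, linking: object file(s) and libraries -> executable.
    cc "$work_dir/hello.o" -o "$work_dir/hello"
    # Step 4, installation: copy the executable to its final location.
    mkdir -p "$install_dir/bin"
    cp "$work_dir/hello" "$install_dir/bin/hello"
    hello_output=$("$install_dir/bin/hello")
else
    hello_output="no C compiler found"
fi
echo "$hello_output"
```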

The major tools supporting this process are few and well established:

  1. GNU Make is the de facto standard build tool for software projects developed on and for Linux environments. It relies on a makefile, by default named Makefile, which contains the rules (including the sequence of commands, environment variables and options) that tell Make how to generate an executable from source code.
  2. The configure script is often used to automate the retrieval of system and user information before the compilation and linking steps are performed. It generates a tailored Makefile starting from a general template and the collected information.
  3. CMake, not to be confused with GNU Make, is a meta-build tool that uses system-independent and compiler-independent configuration files to generate specific build scripts for a range of system-specific build tools, including GNU Make.

Steps

Recommended location for manual software builds

To keep your software organised, we recommend the following locations for your manual software installations:

  • Software: /software/projects/<project-id>/<user-name>/manual/software/
  • Modulefiles: /software/projects/<project-id>/<user-name>/manual/modules/

The process of building software starts with obtaining and unpacking its source code into a directory, which from now on is referred to as $ROOT_DIR.

  1. Identify the build process. Usually, the software package provides a detailed description of the build process. Otherwise, you should look for one of the following files indicating which tools are used for the purpose.

    1. A CMakeLists.txt file in the $ROOT_DIR directory indicates a CMake project. Go to step 3.
    2. A configure or configure.sh script in the $ROOT_DIR directory suggests that you must execute a script to configure the build process. Go to step 2.
    3. A Makefile in the $ROOT_DIR directory signals that the project's build process is handled through GNU Make. Go to step 4.
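The file checks listed above can be automated with a small shell function. This is only a sketch: the function name detect_build_system is hypothetical, and it inspects a single directory in the same order as the list.

```shell
# Hypothetical helper: report which build tool a source tree appears to use,
# checking in the same order as the list above.
detect_build_system() {
    root_dir="$1"
    if [ -f "$root_dir/CMakeLists.txt" ]; then
        echo "cmake"
    elif [ -f "$root_dir/configure" ] || [ -f "$root_dir/configure.sh" ]; then
        echo "configure"
    elif [ -f "$root_dir/Makefile" ]; then
        echo "make"
    else
        echo "unknown"
    fi
}

# Demonstration on a throwaway directory.
demo_root=$(mktemp -d)
touch "$demo_root/Makefile"
detect_build_system "$demo_root"      # prints: make
touch "$demo_root/configure"
detect_build_system "$demo_root"      # prints: configure
```

In practice you would call it as detect_build_system "$ROOT_DIR" after unpacking the source code.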
  2. configure scripts. A build process uses a configure script to collect information about the environment (operating system, compilers, libraries, etc.) in which you intend to build the software. The script is able to collect most of the needed information and decide on the best configuration automatically. However, there are usually a few options that you must set; for instance, the --prefix option specifies the absolute path (that is, the path starting from the root of the filesystem) to the installation directory. There may also be options that are not required but are desirable in a supercomputing environment, for example, options enabling vectorised instructions. Once the script has run, GNU Make must typically be executed next (step 4).

    To see the list of all options and arguments, execute the configure script with the --help option:

        $ ./configure --help

    As an example, the following line shows how to run a configure script specifying the --prefix option:

        $ ./configure --prefix=/path/to/installation/dir

    You can also set the value of the environment variables used by the configure script, like this:

        $ ./configure VAR=VALUE

    List of the most common compiling and linking variables

    Variable    Meaning                     Example
    CC          C compiler                  CC=icc
    CXX         C++ compiler                CXX=icpc
    FC          Fortran compiler            FC=ifort
    CFLAGS      C compiler flags            CFLAGS=-O2
    CXXFLAGS    C++ compiler flags          CXXFLAGS=-O2
    FCFLAGS     Fortran compiler flags      FCFLAGS=-O2
    CPPFLAGS    C/C++ preprocessor flags    CPPFLAGS=-I<include dir>
    LDFLAGS     Linker flags                LDFLAGS=-L<library dir>
    LIBS        Linker libraries            LIBS=-l<lib name>

    Notes and best practices. Always check the output produced by a configure script. It might contain warnings that call for a modification of the configure options or the shell environment. Some configure scripts compile a test code and execute it, to set some compilation options accordingly. This may not work in a cross-compilation environment such as the one on some Cray supercomputers. For instance, Cray XC40 login nodes do not have the Aries interconnect and so testing for MPI might fail. Another example is testing for GPU computing capability of a GPU cluster on login nodes, which might not have GPUs. If you encounter this, try running the build process on a compute node.
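One way to follow the advice about checking the configure output is to save it to a log file and list the lines that mention a warning. The sketch below assumes nothing about any particular package: the helper name run_and_list_warnings and the log file name are hypothetical, and a two-line stand-in script replaces a real configure script so that the example runs anywhere.

```shell
# Hypothetical helper: run a command, save everything it prints to a log file,
# then list the lines that mention a warning. Returns the command's exit status.
run_and_list_warnings() {
    log_file="$1"
    shift
    "$@" > "$log_file" 2>&1
    status=$?
    grep -i -n 'warning' "$log_file"
    return "$status"
}

# Stand-in for a real configure script so the sketch runs anywhere.
demo_dir=$(mktemp -d)
cat > "$demo_dir/configure" <<'EOF'
echo "checking for C compiler... yes"
echo "configure: WARNING: optional library not found" >&2
EOF

warnings=$(run_and_list_warnings "$demo_dir/configure_output.log" \
    sh "$demo_dir/configure" --prefix="$demo_dir/install")
echo "$warnings"
```

With a real package the call would look like run_and_list_warnings configure_output.log ./configure --prefix=/path/to/installation/dir, run from $ROOT_DIR.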

  3. Building using CMake. Similarly to configure scripts, CMake generates one or more environment-dependent build files (on Linux-based systems these are Makefiles, covered in step 4) from a high-level, environment-independent definition of the build process contained in the CMakeLists.txt file. Terminal 1 shows the typical sequence of commands you should use. Once completed, move to step 4.

    Terminal 1. Using CMake to generate build files
    $ cd $ROOT_DIR
    $ mkdir build
    $ cd build
    $ sg <projectcode> -c 'cmake ..'

    In words,

    1. Change the working directory of the terminal to $ROOT_DIR and create a directory named build inside it (the name can vary, although the one suggested here is standard practice).

    2. Change the working directory again, this time to the newly created build directory.

    3. From the build directory, execute the command cmake passing as an argument the path to the directory containing the CMakeLists.txt file. Typically the relative path .. is used.

    As with a configure script, you can pass options to CMake. The most common one is CMAKE_INSTALL_PREFIX, which dictates where binaries will be installed (the default location being /usr/local). The syntax for specifying an option to CMake is -DOPTION=Value. In this case, the command would look like this:

        $ cmake -DCMAKE_INSTALL_PREFIX=/path/to/installation/directory ..

  4. Building using GNU Make. Conceptually, GNU Make executes the commands needed to compile and link a program as specified in the Makefile, using a dedicated syntax that allows declaring dependencies between the build steps. To launch the build process, change the working directory of the terminal to the one containing the Makefile (usually $ROOT_DIR, or the build directory if the Makefile was generated by CMake) and execute the make command. Next, execute the make install command to install the built executable or library.

        $ sg <projectcode> -c 'make'
        $ sg <projectcode> -c 'make install'

    The install argument to make is called a target. A target represents a subset of the Makefile that accomplishes a particular task in the larger context of the build process. In the commands above, the first make command executes the default target, which usually builds the software without installing it. The install target installs the binaries, that is, the produced executables or libraries.
    Sometimes you must change the value of some variables defined in the Makefile. Some variable names are standard across most Makefiles. In particular, CC, CXX and FC are used to define executable names for the C, C++ and Fortran compilers, respectively, whereas CFLAGS, CXXFLAGS and FFLAGS are used for the corresponding compiler flags.

    All the compiler modules on Pawsey HPC systems define the compiler variables CC, CXX and FC, which are then ready to be used by GNU Make.
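Makefile variables can be overridden without editing the file by passing VAR=VALUE pairs on the make command line. The sketch below demonstrates this on a throwaway Makefile with a hypothetical target named show; it assumes make is installed, and the values gcc and -O2 are only examples.

```shell
# Create a throwaway Makefile whose only target prints two standard variables.
# The recipe line must start with a tab, hence printf with \t.
demo_dir=$(mktemp -d)
printf 'show:\n\t@echo "CC=$(CC) CFLAGS=$(CFLAGS)"\n' > "$demo_dir/Makefile"

if command -v make >/dev/null 2>&1; then
    # Variables given on the command line take precedence over the
    # values assigned inside the Makefile and over built-in defaults.
    make_output=$(cd "$demo_dir" && make -s show CC=gcc CFLAGS=-O2)
else
    make_output="make not found"
fi
echo "$make_output"
```

For a real project the same form applies to the default target, for example make CC=gcc CFLAGS=-O2.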

Result

The software you have built is now located at the installation path. See the Next Steps section for what to do next in order to use it.

Example

This example shows how to build gromacs/2021.4 on Setonix using CMake. Although the application is available through Spack, sometimes users need a custom build with particular patches or flags.

  1. Log in to Setonix, then move to your /software folder and download the source code of GROMACS. See Software Stack for more information on the organisation of software on Setonix.
    $ cd /software/projects/<project-id>/<user-name>/manual/software
    $ wget https://gitlab.com/gromacs/gromacs/-/archive/v2021.4/gromacs-v2021.4.tar.gz
  2. Request an interactive session on a compute node, with 64 CPU cores to enable a parallel build. Alternatively, you can write a build script and submit the job to the scheduler.
    $ salloc -p work --ntasks=1 -c 64
  3. Extract the source code from the archive, then execute the build process.

    Terminal 2. Building gromacs using CMake
    $ tar -xf gromacs-v2021.4.tar.gz
    $ mkdir gromacs-v2021.4/build
    $ cd gromacs-v2021.4/build
    $ module load cray-fftw
    $ module load cray-mpich
    $ sg <projectcode> -c 'cmake -DCMAKE_INSTALL_PREFIX=$MYSOFTWARE/gromacs_manual_build -DGMX_MPI=ON ..'
    [ output ... ]
    $ sg <projectcode> -c 'make -j 64'
    [ output ... ]
    $ sg <projectcode> -c 'make install'
    [ output ... ]
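For the alternative mentioned in step 2, the same commands can be placed in a batch script and submitted to the scheduler. The following is only a sketch: the script name build_gromacs.sh, the time limit and the use of <project-id> as the account are assumptions, and the placeholders must be replaced before submitting. It also assumes the job is submitted from the directory where the archive was extracted.

```shell
# Write a hypothetical batch script that repeats the interactive build steps.
# The file name, account placeholder and time limit are assumptions.
script_file="build_gromacs.sh"
cat > "$script_file" <<'EOF'
#!/bin/bash
#SBATCH --account=<project-id>
#SBATCH --partition=work
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=64
#SBATCH --time=01:00:00

module load cray-fftw
module load cray-mpich

cd gromacs-v2021.4/build
sg <projectcode> -c 'cmake -DCMAKE_INSTALL_PREFIX=$MYSOFTWARE/gromacs_manual_build -DGMX_MPI=ON ..'
sg <projectcode> -c 'make -j 64'
sg <projectcode> -c 'make install'
EOF

# After replacing the placeholders, submit with: sbatch build_gromacs.sh
```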

Next steps

Once you have installed your software, you may need to set some environment variables so that the operating system can find the software and its dependencies. The most commonly needed ones are PATH, LD_LIBRARY_PATH and LIBRARY_PATH.
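As a minimal sketch, the three variables can be extended as follows. It assumes the software was installed with the conventional bin and lib subdirectories (some packages use lib64 instead), and the prefix shown is a placeholder for the path given to --prefix or CMAKE_INSTALL_PREFIX.

```shell
# Hypothetical installation prefix; replace with the path given to --prefix
# or CMAKE_INSTALL_PREFIX when the software was built.
install_prefix="/path/to/installation/dir"

# Prepend the new locations so they take precedence. The ${VAR:+...} form
# avoids a trailing colon when the variable was previously empty or unset.
export PATH="$install_prefix/bin:$PATH"
export LD_LIBRARY_PATH="$install_prefix/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LIBRARY_PATH="$install_prefix/lib${LIBRARY_PATH:+:$LIBRARY_PATH}"
```

These settings last only for the current shell session, which is one reason a modulefile is often more convenient.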

You may want to create a module for your software to modify your environment easily. Check Modules for more information.
