Profiling with gprof
The gprof program is a profiling tool found in most Linux environments. It collects performance metrics using sampling techniques as well as instrumentation. It requires your program to be compiled with a supporting compiler such as the GNU compiler.
Prerequisite knowledge
To use gprof on Pawsey supercomputers you should first be familiar with using it in your own development environment.
Overview
The following is an overview of the process for using gprof to profile your program:
- Enable profiling in your program by compiling and linking with appropriate flags for the compiler.
- (Optional) Set environment variables for MPI programs to set output profile filenames.
- Run your program as usual, and a file containing profiling information will be generated.
- Analyse the output of this file using the gprof command.
This page provides a step-by-step example of using the gprof tool to generate profiling information about the performance of a program.
Steps
Compile and link the source code using the -pg compiler flag, for example:
Compiling with profiling flags
$ cc -pg -o example example.c
(Optional) Set the profile output filename prefix in your jobscript if your program uses MPI for parallelism:
Enabling MPI profiling output in your jobscript
export GMON_OUT_PREFIX=gmon.out
srun -n 4 ./example
Run the program to generate the output file(s):
Running the job to generate profiling information
$ sbatch batchscript.slurm
View the output with gprof. For non-MPI programs use the single gmon.out file:
Viewing the profiling information
$ gprof example gmon.out
For MPI programs, specify a single gmon.out.* file if one particular process is of interest, or all of them for combined output:
Viewing MPI generated profiling information
$ gprof example gmon.out.*
Result
The final gprof command will print the profiling information, which can be customised using command line options.
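The MPI steps above rely on each process writing its own profile file, which is why the files are matched with gmon.out.*. As a rough illustration, on glibc-based Linux systems the file name is understood to be formed by appending a dot and the process ID to the value of GMON_OUT_PREFIX, and to fall back to gmon.out when the variable is unset. The helper below is hypothetical (gmon_output_name is not part of gprof or glibc); it only mimics that naming rule so you can predict which files to expect.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: builds the profile file
 * name that a process with the given PID is expected to write.
 * Assumption: the runtime names the file "<prefix>.<pid>" when
 * GMON_OUT_PREFIX is set, and "gmon.out" when it is not set.
 * Returns the value of snprintf (length of the name).
 */
int gmon_output_name(char *buf, size_t len, const char *prefix, long pid)
{
    if (prefix == NULL) {
        /* Variable not set: a single file named gmon.out. */
        return snprintf(buf, len, "gmon.out");
    }
    /* Variable set: one file per process, suffixed with its PID. */
    return snprintf(buf, len, "%s.%ld", prefix, pid);
}
```

For example, calling gmon_output_name(buf, sizeof buf, "gmon.out", 12345) fills buf with gmon.out.12345. A job with four MPI processes would therefore be expected to leave four such files in the working directory, one per process.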
Example
This example profiles an MPI program which calculates the value of pi.
Create a file darts-mpi.c that contains the source code:
Source code in darts-mpi.c file
/* Compute pi using the six basic MPI functions */
#include "lcgenerator.h"
#include <mpi.h>
#include <stdio.h>

static long num_trials = 1000000;

int main(int argc, char **argv) {
  long i;
  long Ncirc = 0;
  double pi, x, y;
  double r = 1.0; // radius of circle
  double r2 = r*r;

  int rank, size, manager = 0;
  MPI_Status status;
  long my_trials, temp;
  int j;

  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  my_trials = num_trials/size;
  if (num_trials%(long)size > (long)rank) my_trials++;
  random_last = rank;

  for (i = 0; i < my_trials; i++) {
    x = lcgrandom();
    y = lcgrandom();
    if ((x*x + y*y) <= r2)
      Ncirc++;
  }

  if (rank == manager) {
    for (j = 1; j < size; j++) {
      MPI_Recv(&temp, 1, MPI_LONG, j, j, MPI_COMM_WORLD, &status);
      Ncirc += temp;
    }
    pi = 4.0 * ((double)Ncirc)/((double)num_trials);
    printf("\n \t Computing pi using six basic MPI functions: \n");
    printf("\t For %ld trials, pi = %f\n", num_trials, pi);
    printf("\n");
  } else {
    MPI_Send(&Ncirc, 1, MPI_LONG, manager, rank, MPI_COMM_WORLD);
  }

  MPI_Finalize();
  return 0;
}
Compile and link the source code using the -pg compiler flag, for example:
Compiling with profiling flags
$ cc -pg -o darts-mpi darts-mpi.c
Create a jobscript called darts-mpi.slurm and set the GMON_OUT_PREFIX variable:
Jobscript with GMON_OUT_PREFIX set
#!/bin/bash -l
#SBATCH --job-name=darts-mpi
#SBATCH --nodes=1
#SBATCH --time=00:05:00

export GMON_OUT_PREFIX=gmon.out
srun --export=all -n 128 darts-mpi
Run the program to generate the output file(s):
Running the job to generate profiling information
$ sbatch darts-mpi.slurm
View the output with gprof, specifying the executable followed by gmon.out.*:
Viewing MPI generated profiling information
$ gprof darts-mpi gmon.out.*
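The darts-mpi.c source includes lcgenerator.h, which is not shown on this page, so the example cannot be compiled as listed. A minimal stand-in is sketched below, assuming the header only needs to provide the two names the source uses: the variable random_last and the function lcgrandom(). The linear congruential constants are an assumption chosen for illustration and may differ from the original header.

```c
/* lcgenerator.h: hypothetical stand-in for the header used by darts-mpi.c.
 * Assumption: only random_last and lcgrandom() are required. */
#ifndef LCGENERATOR_H
#define LCGENERATOR_H

/* Illustrative linear congruential generator constants (assumed values). */
static const long MULTIPLIER = 1366;
static const long ADDEND = 150889;
static const long PMOD = 714025;

/* Last value produced; darts-mpi.c seeds this with the MPI rank. */
static long random_last = 0;

/* Advance the generator and return a pseudo-random number in [0, 1). */
static double lcgrandom(void)
{
    random_last = (MULTIPLIER * random_last + ADDEND) % PMOD;
    return (double)random_last / (double)PMOD;
}

#endif /* LCGENERATOR_H */
```

Saving this as lcgenerator.h next to darts-mpi.c should allow the compile step to succeed. Because lcgrandom() is called twice per trial, it is a likely candidate to appear near the top of the flat profile, although the exact output depends on the compiler and optimisation settings.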
Next steps
The gprof output can be customised using command line options.
Refer to the gprof manual page for more information:
$ man gprof
Pressing q will exit the manual pages.