Vectorisation
Vectorisation is a feature of modern processors that can execute a single instruction on multiple data objects in parallel within a single core.
Adapting your application to take advantage of vectorisation is important for achieving high performance on today's computer architectures.
Prerequisite knowledge
Before getting started with vectorisation, you should know how to install and run your program, including how to change compilation flags for compiled languages. You should also be familiar with the programming language your program is written in.
Vectorisation example
To demonstrate the process of vectorisation, we will start with a simple loop that adds entries of two vectors a and b and stores the results in vector c.
    do i=1,1024
      c(i)=a(i)+b(i)
    end do
In the first phase the loop is unrolled by a factor of four, so that each iteration processes four consecutive entries, as shown in listing 2.
    do i=1,1024,4
      c(i)=a(i)+b(i)
      c(i+1)=a(i+1)+b(i+1)
      c(i+2)=a(i+2)+b(i+2)
      c(i+3)=a(i+3)+b(i+3)
    end do
Now the loop can be transformed into its vector equivalent.
    do i=1,1024,4
      load into vector registers vec_a and vec_b
      vec_c = vec_a + vec_b
      store from vector register vec_c
    end do
In the vectorised loop shown in listing 3, four consecutive entries of vectors a and b are first loaded into their corresponding vector registers vec_a and vec_b. A single vector addition instruction is then executed and its result is written to vec_c. Finally, the four resulting entries are stored at the corresponding addresses in c.
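The load, add and store steps described above can be sketched in C. This is only an illustration, not how the compiler implements vectorisation: it assumes a compiler that supports the GCC/Clang `vector_size` extension, a vector width of four single-precision values, and an array length that is a multiple of four. The names `vec4f` and `add_vec4` are made up for this example.

```c
#include <string.h>

/* Four packed floats. vector_size is a GCC/Clang extension, not standard C. */
typedef float vec4f __attribute__((vector_size(16)));

/* Adds a and b four entries at a time and stores the result in c.
   Assumption: n is a multiple of 4, so no remainder loop is needed. */
void add_vec4(const float *a, const float *b, float *c, int n)
{
    for (int i = 0; i < n; i += 4) {
        vec4f vec_a, vec_b, vec_c;

        memcpy(&vec_a, &a[i], sizeof vec_a);  /* load four entries of a */
        memcpy(&vec_b, &b[i], sizeof vec_b);  /* load four entries of b */

        vec_c = vec_a + vec_b;                /* element-wise vector addition */

        memcpy(&c[i], &vec_c, sizeof vec_c);  /* store four entries of c */
    }
}
```

Calling `add_vec4(a, b, c, 1024)` on three 1024-element arrays corresponds to the loop in the listings above. Using `memcpy` for the loads and stores avoids any assumption about the alignment of the arrays.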
The vector instructions can be specific to a vendor or processor family. While the vector instructions for a particular vendor can be used directly in the code, modern compilers can also automate vectorisation and add these instructions to the code for you during the compilation process. This can be controlled using the command-line arguments described in table 1.
Compiler options
Vectorisation is usually enabled by default when the -O2 optimisation level is enabled. However, each compiler has its own set of options to switch vectorisation on and off as well as to report on failed vectorisation attempts. Those options are summarised in table 1.
Table 1. Compiler options affecting vectorisation
Compiler | Vectorisation | Vectorisation information |
---|---|---|
Cray Fortran | | -hlist=m |
Cray C/C++ | | -fsave-loopmark -Rpass=loop-vectorize |
AOCC | -fvectorize | -Rpass=loop-vectorize |
GNU | | -fopt-info-vec |
Intel | | -vec-reportLEVEL |
PGI | | -Minfo=vect |
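To try the reporting options, it helps to have a small loop that the compiler is likely to vectorise. The sketch below is the element-wise addition from the vectorisation example, written in C. The function name `add_arrays` and the file name `add.c` are hypothetical. With the GNU compiler it could be built with, for example, `gcc -O2 -ftree-vectorize -fopt-info-vec -c add.c`; the exact wording of the report depends on the compiler version.

```c
#include <stddef.h>

/* restrict tells the compiler that a, b and c do not overlap,
   which removes one common obstacle to automatic vectorisation. */
void add_arrays(const float *restrict a, const float *restrict b,
                float *restrict c, size_t n)
{
    /* Plain scalar loop: vectorisation is left entirely to the compiler. */
    for (size_t i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}
```

If the report does not mention this loop, check that the optimisation level and the reporting option match the compiler actually in use.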