Vectorisation

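Vectorisation is the process of transforming code so that it uses vector (SIMD: single instruction, multiple data) instructions, a feature of modern processors that executes a single instruction on multiple data elements in parallel within a single core.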

Adapting your application to take advantage of vectorisation is important for achieving high performance on today's computer architectures.

Prerequisite knowledge

To get started with vectorisation, you should first know how to install and run your program, including how to change compilation flags for compiled languages. You should also be familiar with the programming language used in your program.

Vectorisation example

To demonstrate the process of vectorisation, we will start with a simple loop that adds entries of two vectors a and b and stores the results in vector c.

Listing 1. Original loop
do i=1,1024
  c(i)=a(i)+b(i)
end do

In the first phase, the loop is unrolled by a factor of four into the statements shown in listing 2.

Listing 2. Unrolled loop
do i=1,1024,4
  c(i)=a(i)+b(i)
  c(i+1)=a(i+1)+b(i+1)
  c(i+2)=a(i+2)+b(i+2)
  c(i+3)=a(i+3)+b(i+3)
end do
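
Listing 2 relies on the trip count (1024) being a multiple of the unroll factor (4). As a minimal sketch in C, assuming 0-based indexing and the hypothetical function names add_simple and add_unrolled4 (neither appears in the original listings), the unrolled form can be given a short remainder loop so that it also handles lengths that are not a multiple of four:

```c
#include <stddef.h>

/* Reference version: one addition per iteration (compare listing 1). */
void add_simple(const float *a, const float *b, float *c, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}

/* Unrolled by four (compare listing 2), plus a remainder loop.
 * The function name and the remainder handling are illustrative assumptions. */
void add_unrolled4(const float *a, const float *b, float *c, size_t n)
{
    size_t i = 0;

    /* Main loop: four independent additions per iteration. */
    for (; i + 4 <= n; i += 4) {
        c[i]     = a[i]     + b[i];
        c[i + 1] = a[i + 1] + b[i + 1];
        c[i + 2] = a[i + 2] + b[i + 2];
        c[i + 3] = a[i + 3] + b[i + 3];
    }

    /* Remainder loop: at most three leftover entries when n % 4 != 0. */
    for (; i < n; ++i)
        c[i] = a[i] + b[i];
}
```

Both functions should produce identical results for any n; compilers that unroll automatically typically generate a similar remainder loop themselves.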

Now the loop can be transformed into its vector equivalent.

Listing 3. Vectorised loop
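do i=1,1024,4
  vec_a = a(i:i+3)        ! load four entries of a into vector register vec_a
  vec_b = b(i:i+3)        ! load four entries of b into vector register vec_b
  vec_c = vec_a + vec_b   ! one vector addition on all four pairs
  c(i:i+3) = vec_c        ! store the four results from vector register vec_c
end do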

In the vectorised loop shown in listing 3, four consecutive entries of vectors a and b are first loaded into the corresponding vector registers vec_a and vec_b. The vector addition instruction is then executed and the result is written to vec_c. Finally, the four resulting entries are stored to the corresponding locations in c. Each iteration therefore replaces four scalar additions with a single vector addition.
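
The load, add and store steps can be sketched in C with the vector_size type attribute, which is a GCC/Clang language extension rather than standard C. The type name vec4f and the function name add_vec4 are hypothetical, and the sketch assumes that n is a multiple of four, as in the 1024-entry example. Whether the compiler maps vec4f onto a hardware vector register depends on the target processor and the compilation flags.

```c
#include <stddef.h>
#include <string.h>

/* Four packed single-precision values (16 bytes).
 * vector_size is a GCC/Clang extension, not standard C. */
typedef float vec4f __attribute__((vector_size(16)));

/* Assumes n is a multiple of four, like the 1024-entry loop in listing 3. */
void add_vec4(const float *a, const float *b, float *c, size_t n)
{
    for (size_t i = 0; i < n; i += 4) {
        vec4f vec_a, vec_b, vec_c;

        memcpy(&vec_a, a + i, sizeof vec_a);  /* load four entries of a */
        memcpy(&vec_b, b + i, sizeof vec_b);  /* load four entries of b */
        vec_c = vec_a + vec_b;                /* element-wise vector addition */
        memcpy(c + i, &vec_c, sizeof vec_c);  /* store four entries of c */
    }
}
```

Using memcpy for the loads and stores avoids any alignment assumption on a, b and c; optimising compilers usually turn these small fixed-size copies into plain vector load and store instructions.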

Vector instructions can be specific to a vendor or processor family. While the vector instructions for a particular vendor can be used directly in the code, modern compilers can also automate vectorisation and add these instructions for you during compilation. This can be controlled using the command-line arguments described in table 1.
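
As an illustration of using vendor-specific instructions directly, the sketch below uses the x86 SSE intrinsics _mm_loadu_ps, _mm_add_ps and _mm_storeu_ps from xmmintrin.h. The function name add_sse is hypothetical, and the choice of SSE is an assumption made for the example. On targets where the compiler does not define __SSE__, the code falls back to a plain scalar loop, which shows the portability cost of writing such code by hand.

```c
#include <stddef.h>

#if defined(__SSE__)
#include <xmmintrin.h>   /* x86 SSE intrinsics */
#endif

/* Adds a and b element-wise into c. Illustrative only: the function name
 * and the choice of SSE are assumptions, not taken from the original text. */
void add_sse(const float *a, const float *b, float *c, size_t n)
{
    size_t i = 0;

#if defined(__SSE__)
    /* Process four floats per iteration with 128-bit SSE registers. */
    for (; i + 4 <= n; i += 4) {
        __m128 vec_a = _mm_loadu_ps(a + i);       /* unaligned load of a[i..i+3] */
        __m128 vec_b = _mm_loadu_ps(b + i);       /* unaligned load of b[i..i+3] */
        __m128 vec_c = _mm_add_ps(vec_a, vec_b);  /* four additions at once */
        _mm_storeu_ps(c + i, vec_c);              /* unaligned store to c[i..i+3] */
    }
#endif

    /* Scalar loop: handles the remainder, or everything without SSE. */
    for (; i < n; ++i)
        c[i] = a[i] + b[i];
}
```

Letting the compiler vectorise the plain loop usually keeps the source portable across processor families, which is why hand-written intrinsics tend to be reserved for loops the compiler cannot vectorise on its own.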

Compiler options

Vectorisation is usually enabled by default at the -O2 optimisation level, although some compilers and versions enable it only at -O3 (GNU compilers before version 12, for example). Each compiler also has its own set of options to switch vectorisation on and off and to report on vectorisation attempts, including failed ones. These options are summarised in table 1.


Table 1. Compiler options affecting vectorisation

Compiler        Vectorisation                        Vectorisation information

Cray Fortran    -h vectorLEVEL                       -hlist=m
Cray C/C++      -fvectorize                          -fsave-loopmark -Rpass=loop-vectorize
AOCC            -fvectorize                          -Rpass=loop-vectorize
GNU             -ftree-vectorize                     -fopt-info-vec
Intel           -xAVX, -xCORE-AVX2, -xMIC-AVX512     -vec-reportLEVEL
PGI             -M[no]vect                           -Minfo=vect
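
As a sketch of how the options in table 1 are used, the C file below contains the example loop, and the comment at the top shows one possible GNU build line. The file name vec_add.c and the function name vec_add are hypothetical. The wording of the compiler's vectorisation report varies between compiler versions, so no sample output is shown.

```c
/* vec_add.c (hypothetical file name)
 *
 * One possible GNU build line using the options from table 1:
 *
 *   gcc -O2 -ftree-vectorize -fopt-info-vec -c vec_add.c
 *
 * -ftree-vectorize requests vectorisation explicitly and -fopt-info-vec
 * asks the compiler to report the loops it managed to vectorise.
 */
#include <stddef.h>

/* The loop from listing 1, written in C with 0-based indexing. */
void vec_add(const float *a, const float *b, float *c, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}
```

For the other compilers, the same source can be built by substituting the corresponding pair of options from the table.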

Related pages