Note

This page is still a work in progress and support for Machine Learning workloads has just started. Please check it frequently for updates.




Machine Learning workloads are supported on Setonix through a custom TensorFlow container developed by Pawsey. This page illustrates its usage.

...

Setonix can support Machine Learning workloads thanks to the large number of AMD GPUs installed on the system. AMD maintains a TensorFlow branch with added support for its GPUs. An official AMD container is also available but, unfortunately, it is unusable on Setonix because it lacks support for both the Cray MPI and some core Python packages. For this reason, Pawsey has developed its own TensorFlow container, which can properly run on Setonix. This container is installed on Setonix and available through the module system. The Pawsey TensorFlow container is the only supported way to run TensorFlow on Setonix.

...



The TensorFlow module

Currently, TensorFlow is available on Setonix via modules that make use of containers:

Terminal 1. Look for the TensorFlow module
$ module avail tensorflow
--------------------------------------------------------- /software/setonix/2023.08/containers/views/modules -------------------------------------------------------------
tensorflow/rocm5.6-tf2.12 (D)
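
As a quick, hedged check (assuming the module name listed above, and that the module's bash alias, described further below, passes a command string through to the container), you could try:

$ module load tensorflow/rocm5.6-tf2.12
# Print the TensorFlow version from inside the container; listing GPUs would
# additionally require an allocation on a GPU node.
$ bash -c 'python3 -c "import tensorflow as tf; print(tf.__version__)"'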


...

As you might have guessed, we use the bash alias to call the BASH interpreter within the container to execute a BASH command line that activates the environment and invokes python3 to execute the script. For a more complex sequence of commands, it is advisable to create a support BASH script to be executed with the bash alias (if no virtual environment is needed, the sections related to it can be omitted from the script). Here the name of the Python training script is just an example, although it has been taken from the examples described in the following pages of this topic.
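
As an illustration only, a support script along these lines could be used (a hedged sketch: the virtual environment path and the training script name, train_model.py, are placeholders, not files provided by Pawsey):

#!/bin/bash -l
# run_training.sh - executed inside the container via the bash alias,
# e.g.  bash run_training.sh

# Activate the Python virtual environment (omit these lines if no virtual
# environment is needed).
source /software/projects/<project>/<username>/tf-venv/bin/activate

# Run the training script with the container's python3.
python3 train_model.py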

Users' own containers built on top of the Pawsey TensorFlow container

The Pawsey TensorFlow container image is publicly distributed on quay.io (external site). We recommend using a local installation of Docker on your own desktop to build your container on top of Pawsey's TensorFlow image, starting your Dockerfile with:

FROM quay.io/pawsey/tensorflow:2.12.1.570-rocm5.6.0
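
For example, a derived Dockerfile might look like the following (a hedged sketch: the extra Python packages are placeholders for whatever your workflow needs, and it assumes pip is available in the base image):

FROM quay.io/pawsey/tensorflow:2.12.1.570-rocm5.6.0

# Add extra Python packages on top of the Pawsey image (placeholders).
RUN pip install --no-cache-dir scikit-learn pandas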

To pull the image to your local desktop with Docker you can use:

$ docker pull quay.io/pawsey/tensorflow:2.12.1.570-rocm5.6.0

To learn more about our recommendations for building containers with Docker and translating them into Singularity format for use on Setonix, please refer to the Containers Documentation.
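
As a rough illustration only (image and repository names are placeholders, and the exact Singularity and module commands to use on Setonix are covered in the Containers Documentation), the overall workflow could look like:

# On your local desktop: build the derived image and push it to a registry.
$ docker build -t <your-registry>/<your-image>:latest .
$ docker push <your-registry>/<your-image>:latest

# On Setonix: pull the image and convert it to Singularity format.
$ singularity pull docker://<your-registry>/<your-image>:latest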