
Note: This page is still a work in progress, and support for machine learning workloads has only just begun. Please check back frequently for updates.


...

Terminal 2. A simple interaction with the TensorFlow module.
$ module load tensorflow/rocm5.6-tf2.12 

$ python3
Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2023-09-07 14:29:15.551224: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
>>> tf.__version__
'2.12.0'
>>> exit()

$ 
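
The same check can be scripted instead of typed at the prompt. The following is a minimal sketch, assuming it is run with `python3` after the TensorFlow module has been loaded; the function name `describe_tensorflow` is hypothetical and not part of the module. It reports the version and any GPUs TensorFlow can see, and returns a placeholder result if TensorFlow cannot be imported.

```python
import importlib


def describe_tensorflow():
    """Return the TensorFlow version and visible GPUs, or note that it is missing."""
    try:
        tf = importlib.import_module("tensorflow")
    except ImportError:
        # TensorFlow is not importable, e.g. the module has not been loaded.
        return {"available": False, "version": None, "gpus": []}
    # list_physical_devices("GPU") returns one entry per GPU visible to TensorFlow.
    gpus = [device.name for device in tf.config.list_physical_devices("GPU")]
    return {"available": True, "version": tf.__version__, "gpus": gpus}


if __name__ == "__main__":
    print(describe_tensorflow())
```

On a login node the list of GPUs is expected to be empty; GPUs should only appear inside a job allocation on a GPU node.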


Here is another example of running a simple training script on a GPU node during an interactive session:

Terminal 3. Running an ML Python script interactively on a compute node
$ salloc -p gpu --nodes=1 --gres=gpu:1 -A yourProjectName-gpu --time=00:20:00
salloc: Granted job allocation 4360927
salloc: Waiting for resource configuration
salloc: Nodes nid002828 are ready for job

$ module load tensorflow/rocm5.6-tf2.12  

$ python3 01_horovod_mnist.py 
2023-09-07 14:32:18.907641: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
INFO:root:This is process with rank 0 and local rank 0
INFO:root:This is process with rank 0 and local rank 0: gpus available are: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
2023-09-07 14:32:23.886297: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1635] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 134200961 MB memory:  -> device: 0, name: AMD Instinct MI250X, pci bus id: 0000:d1:00.0
[...]
INFO:root:This is process with rank 0 and local rank 0: my prediction is [[ -3.5764134  -6.1231604  -1.5476028   2.1744065 -14.56255    -5.4938045
  -20.374353   12.388017   -3.1701622  -1.0773858]]
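
The final line of the output lists ten numbers, one per class. The sketch below shows one way to read them, assuming they are unnormalised scores for the ten MNIST digits and that position `i` in the list corresponds to digit `i`. The script itself is not shown on this page, so that interpretation is an assumption. The `softmax` helper is written here for illustration only.

```python
import math

# Scores copied from the last line of the Terminal 3 output.
scores = [-3.5764134, -6.1231604, -1.5476028, 2.1744065, -14.56255,
          -5.4938045, -20.374353, 12.388017, -3.1701622, -1.0773858]


def softmax(values):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


probabilities = softmax(scores)
# The predicted class is the index of the largest score.
predicted_digit = max(range(len(scores)), key=lambda i: scores[i])

print(f"predicted digit: {predicted_digit}")
print(f"probability: {probabilities[predicted_digit]:.4f}")
```

Under these assumptions the largest score (12.388017) sits at index 7, so the model's prediction for this sample is the digit 7.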



...

Terminal 5. The environment can be used once again.
$ module load tensorflow/rocm5.6-tf2.12   

$ bash

Singularity> source $MYSOFTWARE/manual/software/pythonEnvironments/tensorflowContainer-environments/myenv/bin/activate
(myenv) Singularity>
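
To confirm from within Python that the environment is active, the interpreter prefixes can be compared. This is a minimal sketch, assuming `myenv` was created with Python's `venv` module (or a recent `virtualenv`); the function name `in_virtual_environment` is hypothetical.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter is running inside a venv-style environment."""
    # In a venv, sys.prefix points at the environment while sys.base_prefix
    # still points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
    print("interpreter prefix:", sys.prefix)
```

When run after the `source` command above, the reported prefix would be expected to point inside the `myenv` directory.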



...

FROM quay.io/pawsey/tensorflow:2.12.1.570-rocm5.6.0

To pull the image to your local desktop with Docker you can use:

$ docker pull quay.io/pawsey/tensorflow:2.12.1.570-rocm5.6.0

To learn more about our recommendations for building containers with Docker and then converting them to Singularity format for use on Setonix, please refer to the Containers Documentation.

...