Column |
---|
Note |
---|
This page is still a work in progress, and support for machine learning workloads has only just started. Please check it frequently for updates. |
|
...
Column |
---|
|
Code Block |
---|
language | bash |
---|
theme | DJango |
---|
title | Terminal 2. A simple interaction with the TensorFlow module. |
---|
| $ module load tensorflow/rocm5.6-tf2.12
$ python3
Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2023-09-07 14:29:15.551224: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
>>> tf.__version__
'2.12.0'
>>> exit()
$ |
|
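The same check can also be scripted rather than typed at the interpreter prompt. The following is a minimal sketch, assuming only that loading the module makes the `tensorflow` package importable by `python3`; the helper name `tensorflow_report` is hypothetical and is not provided by the module.

```python
def tensorflow_report():
    """Return a small dict describing the TensorFlow installation, if any."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not importable, e.g. the module has not been loaded
        return {"available": False, "version": None, "gpus": []}

    # list_physical_devices("GPU") lists the GPUs visible to this process
    gpus = [device.name for device in tf.config.list_physical_devices("GPU")]
    return {"available": True, "version": tf.__version__, "gpus": gpus}


if __name__ == "__main__":
    print(tensorflow_report())
```

On a login node the list of GPUs is expected to be empty; the version string should match what `tf.__version__` reports in the session above.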
Here is another example of running a simple training script on a GPU node during an interactive session:
Column |
---|
|
Code Block |
---|
language | bash |
---|
theme | DJango |
---|
title | Terminal 3. Running an ML Python script interactively on a compute node |
---|
| $ salloc -p gpu --nodes=1 --gres=gpu:1 -A yourProjectName-gpu --time=00:20:00
salloc: Granted job allocation 4360927
salloc: Waiting for resource configuration
salloc: Nodes nid002828 are ready for job
$ module load tensorflow/rocm5.6-tf2.12
$ python3 01_horovod_mnist.py
2023-09-07 14:32:18.907641: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
INFO:root:This is process with rank 0 and local rank 0
INFO:root:This is process with rank 0 and local rank 0: gpus available are: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
2023-09-07 14:32:23.886297: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1635] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 134200961 MB memory: -> device: 0, name: AMD Instinct MI250X, pci bus id: 0000:d1:00.0
[...]
INFO:root:This is process with rank 0 and local rank 0: my prediction is [[ -3.5764134 -6.1231604 -1.5476028 2.1744065 -14.56255 -5.4938045
-20.374353 12.388017 -3.1701622 -1.0773858]]
|
|
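The final log line prints ten raw scores. Assuming these are per-class outputs for the ten MNIST digits (the script name suggests this, but the log does not state it), the predicted digit is the index of the largest score. A minimal sketch using the values copied from the log above:

```python
# Scores copied from the log line above. Treating them as one score per
# digit class (0-9) is an assumption, not something the log confirms.
scores = [
    -3.5764134, -6.1231604, -1.5476028, 2.1744065, -14.56255,
    -5.4938045, -20.374353, 12.388017, -3.1701622, -1.0773858,
]


def argmax(values):
    """Return the index of the largest element in a non-empty sequence."""
    return max(range(len(values)), key=lambda i: values[i])


predicted_digit = argmax(scores)
print(predicted_digit)  # prints 7
```

Under that assumption, the largest score (12.388017) sits at index 7, so the model's prediction for this sample would be the digit 7.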
...
Column |
---|
|
Code Block |
---|
language | bash |
---|
theme | DJango |
---|
title | Terminal 5. The environment can be used once again. |
---|
| $ module load tensorflow/rocm5.6-tf2.12
$ bash
Singularity> source $MYSOFTWARE/manual/software/pythonEnvironments/tensorflowContainer-environments/myenv/bin/activate
(myenv) Singularity> |
|
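To confirm from within Python that the activated environment is the one actually in use, the interpreter's prefix can be compared with its base prefix. This is a minimal sketch using only the standard library, assuming the environment was created with `venv`; the function name `in_virtualenv` is hypothetical.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

When run at the `(myenv) Singularity>` prompt, the reported prefix should be the `myenv` directory that was sourced in the terminal above.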
...
FROM quay.io/pawsey/tensorflow:2.12.1.570-rocm5.6.0
To pull the image to your local desktop with Docker, you can use:
$ docker pull quay.io/pawsey/tensorflow:2.12.1.570-rocm5.6.0
To learn more about our recommendations for building containers with Docker and then converting them into Singularity format for use on Setonix, please refer to the Containers Documentation.
...