


Warning: TensorFlow module has known issues

Since the June 2024 maintenance, the module tensorflow/rocm5.6-tf2.12 has shown some problems. For a temporary fix, see: June 2024 Software Update - Important Information



In this tutorial, you will see how to write a distributed TensorFlow computation powered by Horovod. The goal is to train different models in parallel, assigning each one to a different GPU. The discussion is organised in two sections: the first illustrates Horovod's basic concepts and how to use it with TensorFlow; the second uses the MNIST classification task as a test case.
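As a rough illustration of the "one model per GPU" idea, the sketch below shows one possible way to decide which model each process should train. It is not the tutorial's own code: the configuration list and the helper `configs_for_rank` are hypothetical, and the process index and process count are passed in as plain arguments. In a Horovod program they would typically be obtained from `hvd.rank()` and `hvd.size()` after calling `hvd.init()`.

```python
# Sketch: decide which model each Horovod process (one per GPU) should train.
# With Horovod, `rank` and `size` would come from hvd.rank() and hvd.size()
# after hvd.init(); here they are plain arguments so the logic runs anywhere.

# Hypothetical hyperparameter sets, one per candidate model (not from the tutorial).
MODEL_CONFIGS = [
    {"name": "small", "hidden_units": 32},
    {"name": "medium", "hidden_units": 128},
    {"name": "large", "hidden_units": 512},
    {"name": "wide", "hidden_units": 1024},
]


def configs_for_rank(configs, rank, size):
    """Return the configurations that process `rank` out of `size` should train.

    Configurations are dealt out round-robin, so with as many processes as
    models each process gets exactly one.
    """
    if size < 1 or not 0 <= rank < size:
        raise ValueError("rank must satisfy 0 <= rank < size")
    return configs[rank::size]


if __name__ == "__main__":
    # Simulate 4 processes to show the resulting assignment.
    for rank in range(4):
        names = [c["name"] for c in configs_for_rank(MODEL_CONFIGS, rank, 4)]
        print(f"rank {rank} trains {names}")
```

With as many processes as models, every process receives a single configuration. With fewer processes, each one works through its share in turn. Binding each process to its own GPU is a separate step, usually based on `hvd.local_rank()`.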

...