...
First, the dataset is loaded through the TensorFlow
module (being a standard dataset for test cases, TensorFlow provides a convenient function to retrieve it) and then split into two parts, one for training and the other for testing. What follows is the definition of the model and the loss function. Up to this point, every process executes the same code.
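As a minimal sketch of this stage, the snippet below loads a dataset and defines a simple classifier. The choice of MNIST and the layer sizes are assumptions made for illustration; the tutorial's own code may use a different dataset and architecture.

```python
import tensorflow as tf

# Load a standard dataset (MNIST is assumed here for illustration) and
# split it into a training part and a testing part.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # normalize pixel values

# Define the model and the loss function. Every process executes this
# same code, so no divergence has happened yet.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10),
])
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer="adam", loss=loss_fn, metrics=["accuracy"])
```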
The processes diverge when the model.fit function is called. Indeed, the training dataset is implicitly partitioned using the size of the computation and the rank of the process: because each rank is unique among all processes, each process gets a different portion of the samples, and therefore each trained model differs from the others. To prove this, each model is evaluated on the same test set through the model.evaluate call. If you run the Python program with this last part added, you should see that the accuracy reported by every task is slightly different.
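A sketch of the partitioning, training, and evaluation step follows, assuming mpi4py provides the rank and size values; the strided slice x_train[rank::size] is one common way to realize the per-rank partition and may differ from the tutorial's own code.

```python
from mpi4py import MPI

# Continues the earlier sketch: x_train, y_train, x_test, y_test,
# and model are assumed to be defined as above.
comm = MPI.COMM_WORLD
rank = comm.Get_rank()   # unique id of this process
size = comm.Get_size()   # total number of processes

# Partition the training set: each process takes every size-th sample
# starting at its own rank, so the shards are disjoint.
x_shard = x_train[rank::size]
y_shard = y_train[rank::size]

# Each process trains on its own shard, so the resulting models differ.
model.fit(x_shard, y_shard, epochs=3, verbose=0)

# Every process evaluates against the same, unpartitioned test set;
# the reported accuracies should differ slightly between ranks.
loss, accuracy = model.evaluate(x_test, y_test, verbose=0)
print(f"rank {rank}/{size}: test accuracy = {accuracy:.4f}")
```

Launching the script with, for example, mpirun -n 4 python train.py (the script name is hypothetical) should print four slightly different accuracy values.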
You can also use the rank and size values in if statements to train completely different models and, in general, make each process follow a different execution path.
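To illustrate, the branch below gives rank 0 a different architecture from every other process; the layer sizes are arbitrary placeholders, not taken from the tutorial.

```python
# Branch on the rank so that processes follow different execution paths.
if rank == 0:
    # Rank 0 trains a small network...
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(10),
    ])
else:
    # ...while every other rank trains a deeper variant.
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10),
    ])
```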