
How to train a Keras model using multiple GPUs?

  • Suppose there are multiple GPUs on a PC. How can all of them be used to train a Keras model?
      January 8, 2021 3:56 PM IST
  • There are two ways to run a single model on multiple GPUs: data parallelism and model parallelism. In most cases, what you need is data parallelism.

    1) Data parallelism

    Data parallelism consists of replicating the target model once on each device and using each replica to process a different fraction of the input data. The best way to do data parallelism with Keras models is to use the tf.distribute API.

    2) Model parallelism

    Model parallelism consists of running different parts of the same model on different devices. It works best for models that have a parallel architecture, e.g. a model with two branches. This can be achieved by using TensorFlow device scopes; a minimal sketch is included at the end of this answer.

    You can also go through the link below for more information.

    https://keras.io/getting_started/faq/#how-can-i-train-a-keras-model-on-multiple-gpus-on-a-single-machine
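
    As a minimal sketch of the device-scope approach mentioned above (the two-branch architecture, layer sizes, and the device names "/gpu:0" and "/gpu:1" are illustrative assumptions, not taken from the FAQ):

    import tensorflow as tf
    from tensorflow import keras

    inputs = keras.Input(shape=(128,))

    # Run one branch of the model on the first GPU ...
    with tf.device("/gpu:0"):
        branch_a = keras.layers.Dense(64, activation="relu")(inputs)

    # ... and the other branch on the second GPU (assumes at least two GPUs are visible).
    with tf.device("/gpu:1"):
        branch_b = keras.layers.Dense(64, activation="relu")(inputs)

    # Merge the branches and finish the model; placement of these ops is left to TensorFlow.
    merged = keras.layers.concatenate([branch_a, branch_b])
    outputs = keras.layers.Dense(10, activation="softmax")(merged)

    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")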

      July 26, 2021 3:53 PM IST
  • If there are multiple GPUs on a single machine, we can use the tf.distribute.MirroredStrategy API to train the Keras model:
    import tensorflow as tf
    from tensorflow import keras

    # Create a MirroredStrategy.
    strategy = tf.distribute.MirroredStrategy()
    print("Number of devices: {}".format(strategy.num_replicas_in_sync))

    # Open a strategy scope. Everything that creates variables
    # (model construction and compilation) must go inside it.
    with strategy.scope():
        # Define and compile the model.
        model = keras.Model(...)
        model.compile(...)

    # Train the model on all available GPUs.
    model.fit(train_dataset, validation_data=val_dataset, epochs=10)

    # Test the model on all available GPUs.
    model.evaluate(test_dataset)
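
    In the snippet above, train_dataset, val_dataset, and test_dataset are assumed to be tf.data.Dataset objects. Here is a minimal sketch of how they might be built from in-memory NumPy arrays (the array shapes, random data, and batch size are placeholders, not part of the original answer):

    import numpy as np
    import tensorflow as tf

    # Illustrative in-memory data; replace with your own arrays or input pipeline.
    x_train, y_train = np.random.rand(1024, 32), np.random.randint(0, 10, 1024)
    x_val, y_val = np.random.rand(128, 32), np.random.randint(0, 10, 128)
    x_test, y_test = np.random.rand(128, 32), np.random.randint(0, 10, 128)

    # With MirroredStrategy, the dataset batch size is the global batch size,
    # which model.fit splits across the replicas.
    batch_size = 64
    train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(batch_size)
    val_dataset = tf.data.Dataset.from_tensor_slices((x_val, y_val)).batch(batch_size)
    test_dataset = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(batch_size)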

    Related Keras documentation: https://keras.io/guides/distributed_training/
      January 8, 2021 3:59 PM IST