
Does Google Cloud ML support GPU?

  • I'm testing Google Cloud ML to speed up training of my ML model with TensorFlow.

    Unfortunately, Google Cloud ML seems to be extremely slow: my mainstream PC is at least 10x faster.

    I doubt it uses a GPU, so I ran a test. I modified the sample code to force the model onto the GPU:

    diff --git a/mnist/trainable/trainer/task.py b/mnist/trainable/trainer/task.py
    index 9acb349..a64a11d 100644
    --- a/mnist/trainable/trainer/task.py
    +++ b/mnist/trainable/trainer/task.py
    @@ -131,11 +131,12 @@ def run_training():
         images_placeholder, labels_placeholder = placeholder_inputs(
             FLAGS.batch_size)
     
    -    # Build a Graph that computes predictions from the inference model.
    -    logits = mnist.inference(images_placeholder, FLAGS.hidden1, FLAGS.hidden2)
    +    with tf.device("/gpu:0"):
    +      # Build a Graph that computes predictions from the inference model.
    +      logits = mnist.inference(images_placeholder, FLAGS.hidden1, FLAGS.hidden2)
     
    -    # Add to the Graph the Ops for loss calculation.
    -    loss = mnist.loss(logits, labels_placeholder)
    +      # Add to the Graph the Ops for loss calculation.
    +      loss = mnist.loss(logits, labels_placeholder)
     
         # Add to the Graph the Ops that calculate and apply gradients.
         train_op = mnist.training(loss, FLAGS.learning_rate)

    This training code works on my PC (gcloud beta ml local train ...) but not in the cloud, where it fails with this error:

    Traceback (most recent call last):
      File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
        "__main__", fname, loader, pkg_name)
      File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
        exec code in run_globals
      File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 239, in <module>
        tf.app.run()
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 43, in run
        sys.exit(main(sys.argv[:1] + flags_passthrough))
      File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 235, in main
        run_training()
      File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 177, in run_training
        sess.run(init)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 766, in run
        run_metadata_ptr)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 964, in _run
        feed_dict_string, options, run_metadata)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1014, in _do_run
        target_list, options, run_metadata)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1034, in _do_call
        raise type(e)(node_def, op, message)
    InvalidArgumentError: Cannot assign a device to node 'softmax_linear/biases': Could not satisfy explicit device specification '/device:GPU:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
    Colocation Debug Info:
    Colocation group had the following types and devices:
    ApplyGradientDescent: CPU
    Identity: CPU
    Assign: CPU
    Variable: CPU
         [[Node: softmax_linear/biases = Variable[container="", dtype=DT_FLOAT, shape=[10], shared_name="", _device="/device:GPU:0"]()]]
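
    The error lists only a CPU among the available devices, so the hard-coded "/gpu:0" placement cannot be satisfied. One way to keep the same code running both locally and in the cloud is to pick the device from whatever is actually registered. The sketch below is an illustration only: pick_device and local_device_names are hypothetical helper names (not part of the sample code), and the TensorFlow lookup assumes the TensorFlow 1.x-era device_lib module is importable.

```python
def pick_device(available_devices):
    """Return '/gpu:0' if any listed device name mentions a GPU, else '/cpu:0'."""
    for name in available_devices:
        if "gpu" in name.lower():
            return "/gpu:0"
    return "/cpu:0"


def local_device_names():
    """List local device names via TensorFlow; return [] if TensorFlow is missing."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return []
    return [d.name for d in device_lib.list_local_devices()]


# Device list copied from the error message above: only a CPU is registered.
cloud_devices = ["/job:localhost/replica:0/task:0/cpu:0"]
print(pick_device(cloud_devices))  # falls back to the CPU instead of failing
```

    In task.py this would presumably be used as `with tf.device(pick_device(local_device_names())):`. Another option in TensorFlow 1.x is to create the session with `tf.ConfigProto(allow_soft_placement=True)`, which lets TensorFlow fall back to an available device when the requested one does not exist. Neither approach makes the job faster; they only confirm whether a GPU is present.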
      May 23, 2019 1:07 PM IST
  • GPUs are now in Beta and all Cloud ML customers have access.

    Here are the docs for using GPUs with Cloud ML.
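
    As a rough illustration of what requesting a GPU looks like, the sketch below builds the request body for a training job with the BASIC_GPU scale tier (a single worker with one GPU). The field names follow the Cloud ML Engine jobs API as I understand it; the job ID, bucket path, and region are placeholders, and build_training_job is a hypothetical helper, so check everything against the current docs before relying on it.

```python
import json


def build_training_job(job_id, package_uri, region="us-east1"):
    """Build a training-job request body that asks for a single GPU worker."""
    return {
        "jobId": job_id,
        "trainingInput": {
            "scaleTier": "BASIC_GPU",        # one worker instance with a GPU
            "region": region,                # placeholder; GPUs exist only in some regions
            "packageUris": [package_uri],    # placeholder path to the packaged trainer
            "pythonModule": "trainer.task",  # module from the question's sample
        },
    }


job = build_training_job("mnist_gpu_test", "gs://your-bucket/trainer-0.0.0.tar.gz")
print(json.dumps(job, indent=2, sort_keys=True))
```

    Without a GPU scale tier (or a custom tier with a GPU machine type), the job runs on CPU-only workers, which matches the device list in the question's error.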

      May 23, 2019 1:11 PM IST
  • Google Cloud provides several GPU options. These GPUs can be selected as part of two Google instance types:
      - Accelerator-Optimized High-GPU, with about 7 GB of RAM per vCPU, 12–96 Cascade Lake vCPUs, and SSD storage.
      - Accelerator-Optimized Mega-GPU, with about 14 GB of RAM per vCPU, 96 Cascade Lake vCPUs, and SSD storage.
      August 26, 2021 5:34 PM IST