horn-dev mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject [DISCUSS] Some doubts in multi-GPUs
Date Mon, 16 Nov 2015 07:18:23 GMT
Hi folks,

According to http://tensorflow.org/tutorials/deep_cnn/index.md,

they update model parameters synchronously, waiting for all GPUs to
finish processing a batch of data. Here, the expected bottleneck at
each iteration is switching to the next mini-batch (copying the next
set of images and parameters).
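To make the scheme concrete, here is a toy NumPy sketch of synchronous data parallelism (the quadratic loss, device count, and all names are my own illustration, not the tutorial's code): each "GPU" computes a gradient on its shard, everyone waits, and a single averaged update is applied.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: minimize ||Xw - y||^2 with synchronous data parallelism.
w = np.zeros(4)                       # shared parameters (one copy)
X = rng.normal(size=(64, 4))
y = X @ np.array([1.0, -2.0, 0.5, 3.0])

num_gpus = 2
lr = 0.05

for step in range(300):
    shards = zip(np.array_split(X, num_gpus), np.array_split(y, num_gpus))
    grads = []
    for Xs, ys in shards:             # each "GPU" processes its own shard
        grads.append(2 * Xs.T @ (Xs @ w - ys) / len(ys))
    # Barrier: only after ALL devices finish is the averaged gradient applied.
    w -= lr * np.mean(grads, axis=0)
```

The per-iteration cost of distributing the next shards (the `np.array_split` step here) stands in for the mini-batch-copying bottleneck mentioned above.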

If the model and mini-batch fit into the memory of each device, this
makes sense. But model parallelism is quite different: every device
must communicate with every other device.
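As a rough illustration of why model parallelism forces per-step communication, here is a toy NumPy sketch that splits one layer's weight matrix across two simulated devices (the column-wise split and all names are my own assumption, not anything from the tutorial):

```python
import numpy as np

rng = np.random.default_rng(1)

x = rng.normal(size=(8, 16))          # one mini-batch of activations

# Full layer weights, split column-wise across two "devices".
W = rng.normal(size=(16, 32))
W_dev0, W_dev1 = np.hsplit(W, 2)      # each device holds half the output units

# Each device computes only its half of the layer's output...
out0 = x @ W_dev0
out1 = x @ W_dev1

# ...but the next layer needs the full activation, so the halves must be
# exchanged and concatenated on EVERY step -- this cross-device traffic is
# what limits model-parallel scaling.
out = np.concatenate([out0, out1], axis=1)
```

The concatenation reproduces the single-device result exactly, but only after moving half the activations between devices at every forward (and backward) pass.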

Here's some interesting result: "ASR needs many computational power,
but consumes only about 1 GB memory, so GPU is a good available option
which has much higher computational power compared to microprocessors
but with limited memory. To exploit multiple GPUs in one server, we
first built a model parallelism framework for ASR and gained a 1.5
times speedup with two GPUs. However, model parallelism has limited
scalability for ASR and could not achieve better performance when
using more than two GPUs. So we pay much more attention to data
parallelism for multi-GPU DNN framework" -- Tencent's Mariana.

Another point is the weakness of GPU memory at random access. For
this reason, they decided to store and update all model parameters on
the CPU (see the green box).
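That CPU-hosted parameter scheme can be sketched as a tiny parameter-server-style loop (again plain NumPy; the explicit copy-out/ship-back flow and all names are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

# The master copy of the parameters lives on the "CPU".
cpu_w = np.zeros(3)
X = rng.normal(size=(32, 3))
y = X @ np.array([2.0, -1.0, 0.5])
lr = 0.05

for step in range(300):
    shards = zip(np.array_split(X, 2), np.array_split(y, 2))
    grads = []
    for Xs, ys in shards:
        dev_w = cpu_w.copy()          # copy current parameters out to the device
        g = 2 * Xs.T @ (Xs @ dev_w - ys) / len(ys)
        grads.append(g)               # ship the gradient back to the CPU
    cpu_w -= lr * np.mean(grads, axis=0)   # single update on the CPU copy
```

Keeping the authoritative copy in one place trades the parameter-copy traffic each step for sequential, predictable memory access on the CPU side.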

Do you think TensorFlow is a good fit for training large neural networks?

Best Regards, Edward J. Yoon
