singa-dev mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SINGA-131) Implement and optimize hybrid training using both CPU and GPU
Date Wed, 06 Apr 2016 12:29:25 GMT

    [ https://issues.apache.org/jira/browse/SINGA-131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15228180#comment-15228180 ]

ASF subversion and git services commented on SINGA-131:
-------------------------------------------------------

Commit a91e82f3c8771980bff916511a7c750f5f6d039d in incubator-singa's branch refs/heads/master
from [~flytosky]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-singa.git;h=a91e82f ]

SINGA-131 Implement and optimize hybrid training using both CPU and GPU

Allow users to set the batchsize of StoreInputLayer instances manually.
We can then assign different workloads to GPU and CPU workers, which
have different input layers, e.g.,
```
batchsize: 128
batchsize: 16
```
If the first worker is a GPU worker and the second is a CPU worker, the
above setting assigns 128 images per mini-batch to the GPU worker and
16 images to the CPU worker.

Internally, StoreInputLayer gets its batchsize based on its partition
ID.
Currently, this works for the MNIST example, which has no Conv layers.
For Conv layers, since the GPU and CPU implementations have different
layer names, we cannot use a single net config.
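The per-partition lookup itself is simple. The fragment below is a minimal C++ sketch of the idea, not the actual StoreInputLayer code; the helper name and the way the configured batchsize values are passed in are assumptions made for illustration.

```
// Sketch: a worker's input layer picks the batchsize entry matching its
// partition ID. Names and data flow are illustrative, not SINGA's API.
#include <cstdio>
#include <vector>

int SelectBatchsize(const std::vector<int>& batchsizes, int partition_id) {
  // A single configured value is shared by all partitions; otherwise
  // partition i uses the i-th configured value.
  if (batchsizes.size() == 1) return batchsizes[0];
  return batchsizes.at(partition_id);
}

int main() {
  std::vector<int> batchsizes = {128, 16};  // GPU worker, CPU worker
  std::printf("partition 0 -> %d\n", SelectBatchsize(batchsizes, 0));  // 128
  std::printf("partition 1 -> %d\n", SelectBatchsize(batchsizes, 1));  // 16
  return 0;
}
```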


> Implement and optimize hybrid training using both CPU and GPU
> -------------------------------------------------------------
>
>                 Key: SINGA-131
>                 URL: https://issues.apache.org/jira/browse/SINGA-131
>             Project: Singa
>          Issue Type: Improvement
>            Reporter: wangwei
>              Labels: CPU, GPU, hybrid
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> We previously discussed implementing hybrid training with researchers from Stanford:
> http://mail-archives.apache.org/mod_mbox/singa-dev/201507.mbox/%3CCAJz0iLsd5iSCqqVU4QHLKzMO2o%2BFt-40kN8RgWkYhDn%3D6Qqqbw%40mail.gmail.com%3E
> Now that GPU training is supported, we can move on to this feature.
> The distributed training framework naturally supports hybrid training with CPU and GPU. The
first n workers would be assigned GPU cards (n is the number of cards configured by users),
and the remaining workers would run on CPU.
> Some code may need updates and optimization to account for the memory transfers between
GPU workers and CPU workers. Most of this code is in worker.cc, param.cc and stub.cc.
> Automatic tuning of the workload between GPU and CPU workers could be designed and implemented
in this ticket or in a new one.
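The placement rule quoted above (first n workers on GPU cards, the rest on CPU) can be sketched as follows. This is a rough illustration, not SINGA's worker.cc code; the Worker struct, the AssignDevices helper, and the device encoding are hypothetical.

```
// Sketch: bind the first n workers to GPU cards 0..n-1, the rest to CPU.
// The Worker struct and device encoding (-1 = CPU) are hypothetical.
#include <cstdio>
#include <vector>

struct Worker {
  int id;
  int device;  // >= 0: GPU card index, -1: CPU
};

std::vector<Worker> AssignDevices(int num_workers, int num_gpus) {
  std::vector<Worker> workers;
  for (int i = 0; i < num_workers; ++i)
    workers.push_back({i, i < num_gpus ? i : -1});
  return workers;
}

int main() {
  // 4 workers, 2 GPU cards: workers 0 and 1 get GPUs, 2 and 3 run on CPU.
  for (const auto& w : AssignDevices(4, 2))
    std::printf("worker %d -> %s\n", w.id, w.device >= 0 ? "GPU" : "CPU");
  return 0;
}
```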



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
