horn-dev mailing list archives

From "Edward J. Yoon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HORN-2) Umbrella ticket for Implementation Planning of Apache Horn
Date Thu, 19 Nov 2015 02:15:11 GMT

    [ https://issues.apache.org/jira/browse/HORN-2?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15012634#comment-15012634
] 

Edward J. Yoon commented on HORN-2:
-----------------------------------

Here's my idea after thinking more:

We provide two training modes: 1) mini-batch and 2) online. As you know, the legacy
mini-batch SGD code fits the tensor/matrix approach, and a GPU can easily be used. The
other is online SGD based on iterative computing, like Pregel. It can be slower than
mini-batch, but I think it can be useful for large models.
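The contrast between the two modes can be shown with a toy sketch (hypothetical code, not the Horn API): mini-batch SGD accumulates the gradient over a whole batch and applies one update, which maps naturally onto matrix operations, while online SGD updates after every single example, which matches a Pregel-style iterative computation.

```java
// Hypothetical sketch (not Horn code): the two SGD update styles on a
// 1-D least-squares model y = w * x with loss (w*x - y)^2 / 2.
public class SgdModes {

    // Mini-batch SGD: accumulate the gradient over the batch, then
    // apply a single averaged update (the matrix/GPU-friendly form).
    static double miniBatchStep(double w, double[] xs, double[] ys, double lr) {
        double grad = 0.0;
        for (int i = 0; i < xs.length; i++) {
            grad += (w * xs[i] - ys[i]) * xs[i];
        }
        return w - lr * grad / xs.length;
    }

    // Online SGD: update the weight after every example, as a
    // Pregel-like iterative computation would.
    static double onlineStep(double w, double[] xs, double[] ys, double lr) {
        for (int i = 0; i < xs.length; i++) {
            w -= lr * (w * xs[i] - ys[i]) * xs[i];
        }
        return w;
    }

    public static void main(String[] args) {
        double[] xs = {1, 2, 3, 4};
        double[] ys = {2, 4, 6, 8};  // consistent data: true w = 2
        double w1 = 0.0, w2 = 0.0;
        for (int epoch = 0; epoch < 200; epoch++) {
            w1 = miniBatchStep(w1, xs, ys, 0.05);
            w2 = onlineStep(w2, xs, ys, 0.05);
        }
        System.out.printf("mini-batch w=%.3f online w=%.3f%n", w1, w2);
    }
}
```

Both converge to the same weight here; the difference is that the online variant touches the parameters once per example, which is the access pattern a distributed, per-vertex computation would produce.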

> Umbrella ticket for Implementation Planning of Apache Horn
> ----------------------------------------------------------
>
>                 Key: HORN-2
>                 URL: https://issues.apache.org/jira/browse/HORN-2
>             Project: Apache Horn
>          Issue Type: Wish
>            Reporter: Edward J. Yoon
>
> My old rough idea is described here: http://blog.udanax.org/2015/06/googles-distbelief-clone-project-on.html
> The basic idea of data and model parallelism is to use a remote parameter server to
> parallelize model creation and distribute training across machines, and to use region
> barrier synchronization per task group, instead of global barrier synchronization, to
> perform asynchronous mini-batches within a single BSP job.
> Since Apache Hama provides a pluggable interface for synchronization [1], we can easily
> create our own region barrier synchronization service for handling multiple BSP worker
> groups (regarding management of the task topology, I have no concrete idea yet).
> For the parameter server, we need to decide whether to use an existing open source
> implementation or implement it ourselves.
> My rough Programming Interface Design is focused only on feed-forward networks such as
> MLP, CNN, and Autoencoder. We may want to cover everything.
> 1. http://wiki.apache.org/hama/SyncService
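As a concrete sketch of the push/pull pattern the quoted parameter-server idea implies (hypothetical names, a single-process stand-in for the remote service, not an existing Horn or Hama class): workers pull the current parameters, compute gradients on their data shard, and push updates that the server applies immediately, with no global barrier between workers.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical single-process stand-in for a remote parameter server.
// Each worker pulls parameters, computes gradients locally, and pushes
// updates asynchronously with respect to the other workers.
public class ParameterServer {
    private final Map<String, Double> params = new ConcurrentHashMap<>();

    // A worker pulls the latest value of a parameter before computing.
    public double pull(String key) {
        return params.getOrDefault(key, 0.0);
    }

    // A worker pushes a gradient; the server applies it immediately,
    // scaled by the learning rate -- no barrier with other workers.
    public void push(String key, double gradient, double lr) {
        params.merge(key, -lr * gradient, Double::sum);
    }

    public static void main(String[] args) {
        ParameterServer ps = new ParameterServer();
        // Two "workers" push gradients for the same weight; the server
        // folds them in as they arrive.
        ps.push("w0", 4.0, 0.1);
        ps.push("w0", 2.0, 0.1);
        System.out.println("w0 = " + ps.pull("w0"));
    }
}
```

A real deployment would put pull/push behind RPC and shard the key space across server tasks; the asynchrony is the point, since it lets worker groups proceed at their own pace under region rather than global barriers.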



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
