horn-dev mailing list archives

From "Edward J. Yoon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HORN-2) Umbrella ticket for Implementation Planning of Apache Horn
Date Thu, 08 Oct 2015 13:33:26 GMT

    [ https://issues.apache.org/jira/browse/HORN-2?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14948650#comment-14948650
] 

Edward J. Yoon commented on HORN-2:
-----------------------------------

As I mentioned on the mailing list, the BSP framework needs to cover the Parameter Server too.

I had thought of the Parameter Server as just an external in-memory database that handles
concurrent access efficiently. However, using the BSP framework, we might be able to launch both
Parameter Servers and Worker tasks at once. Maybe we should do exactly that! We first divide the
BSPTasks into a server group and a worker group, and then divide the workers into several worker
groups, each training its model in parallel.
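
A minimal sketch of the two-level grouping described above, written in Python for brevity
(Hama itself is Java). The function name assign_groups, the choice of the lowest task IDs as
servers, and the round-robin assignment of workers are assumptions for illustration only; they
are not part of Horn or Hama.

```python
def assign_groups(num_tasks, num_servers, num_worker_groups):
    """Split task IDs 0..num_tasks-1 into one server group and several worker groups.

    Hypothetical policy: the first `num_servers` IDs become parameter servers,
    and the remaining IDs are dealt round-robin into the worker groups.
    """
    if not 0 < num_servers < num_tasks:
        raise ValueError("need at least one server and at least one worker")
    num_workers = num_tasks - num_servers
    if not 0 < num_worker_groups <= num_workers:
        raise ValueError("each worker group needs at least one worker")

    servers = list(range(num_servers))
    worker_groups = [[] for _ in range(num_worker_groups)]
    for i, task_id in enumerate(range(num_servers, num_tasks)):
        worker_groups[i % num_worker_groups].append(task_id)
    return servers, worker_groups


if __name__ == "__main__":
    servers, worker_groups = assign_groups(num_tasks=10, num_servers=2, num_worker_groups=3)
    print("servers:", servers)
    print("worker groups:", worker_groups)
```

With a mapping like this, each task could decide its role from its own task ID at startup,
which is one way the cluster topology might be configured from a few job parameters.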

In this case, the main advantage is that the cluster topology for distributed training can be
configured easily. With this, we can cover Downpour, Sandblaster, and Caffe-like distributed
Hogwild in a single framework.

> Umbrella ticket for Implementation Planning of Apache Horn
> ----------------------------------------------------------
>
>                 Key: HORN-2
>                 URL: https://issues.apache.org/jira/browse/HORN-2
>             Project: Apache Horn
>          Issue Type: Wish
>            Reporter: Edward J. Yoon
>
> My old rough idea is described here: http://blog.udanax.org/2015/06/googles-distbelief-clone-project-on.html
> The basic idea of data and model parallelism is to use a remote parameter server to
> parallelize model creation and distribute training across machines, and to use region barrier
> synchronization per task group instead of global barrier synchronization, so that asynchronous
> mini-batches can be performed within a single BSP job.
> Since Apache Hama provides a pluggable interface for synchronization [1], we can easily
> create our own region barrier synchronization service for handling multiple BSP worker groups
> (regarding management of the task topology, I have no concrete idea yet).
> The Parameter Server requires a decision on whether to use an existing open-source
> implementation or to implement our own.
> My rough programming interface design is focused only on feed-forward networks such as
> MLP, CNN, and Autoencoder. We may want to cover everything.
> 1. http://wiki.apache.org/hama/SyncService
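
The region barrier idea in the quoted description can be illustrated with a small
single-process sketch. It uses Python threads and threading.Barrier to stand in for BSP tasks
and for a sync service; the function name run_with_region_barriers and the logging scheme are
assumptions for illustration and do not reflect Hama's SyncService interface.

```python
import threading


def run_with_region_barriers(worker_groups, num_steps):
    """Run one thread per worker; each group synchronizes only on its own barrier.

    Groups never wait for each other, so they advance through their
    mini-batch steps asynchronously with respect to one another.
    """
    # One barrier per worker group instead of a single global barrier.
    barriers = [threading.Barrier(len(group)) for group in worker_groups]
    log = []
    log_lock = threading.Lock()

    def worker(group_idx, task_id):
        for step in range(num_steps):
            # Placeholder for computing one mini-batch and exchanging
            # parameters with the parameter servers.
            with log_lock:
                log.append((group_idx, task_id, step))
            # Wait only for the peers in the same group.
            barriers[group_idx].wait()

    threads = [
        threading.Thread(target=worker, args=(group_idx, task_id))
        for group_idx, group in enumerate(worker_groups)
        for task_id in group
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    events = run_with_region_barriers([[2, 5, 8], [3, 6, 9], [4, 7]], num_steps=3)
    print(len(events), "events recorded")
```

Within a group, no worker starts step s+1 before every peer has finished step s, while
different groups may be at different steps at the same moment.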



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
