horn-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HORN-8) Implementation of Parameter Server
Date Mon, 01 Feb 2016 15:18:40 GMT

    [ https://issues.apache.org/jira/browse/HORN-8?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15126378#comment-15126378 ]

ASF GitHub Bot commented on HORN-8:
-----------------------------------

GitHub user dongjinleekr opened a pull request:

    https://github.com/apache/incubator-horn/pull/9

    HORN-8: Implementation of Parameter Server

    Asynchronous parameter server implemented.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dongjinleekr/incubator-horn feature/parameter-server

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-horn/pull/9.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9
    
----
commit a8e7ff85e395855e4e292979e11732f207430e38
Author: Lee Dongjin <dongjin.lee.kr@gmail.com>
Date:   2016-01-31T13:11:56Z

    1. Separate the master worker from the slave workers.
    2. Make the master worker a dedicated Merger.
    3. Fail when peer count < 2.

commit 8f412c6777fdd2434234ad316165cf165d12ffba
Author: Lee Dongjin <dongjin.lee.kr@gmail.com>
Date:   2016-01-31T15:11:45Z

    Change SmallLayeredNeuralNetworkTrainer#isConverge's type from boolean to AtomicBoolean

commit 91c0c796e76303a0e3cf27606fbc10a03d05ed0e
Author: Lee Dongjin <dongjin.lee.kr@gmail.com>
Date:   2016-02-01T15:15:22Z

    HORN-8: Implement asynchronous parameter server

----


> Implementation of Parameter Server
> ----------------------------------
>
>                 Key: HORN-8
>                 URL: https://issues.apache.org/jira/browse/HORN-8
>             Project: Apache Horn
>          Issue Type: Improvement
>            Reporter: Edward J. Yoon
>
> The current implementation works in a synchronous way, as shown below (SmallLayeredNeuralNetworkTrainer.java, 101 lines):
> {code}
> task0        task1        task2
>       compute updates locally
> -------------------------------- send updates to master task
> -------------------------------- merge updates and broadcast to all tasks
>       compute updates locally
> -------------------------------- send updates to master task
> -------------------------------- merge updates and broadcast to all tasks
>
>                ...
>       (Loop until convergence)
> {code}
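The synchronous loop above can be sketched in plain Java using a CyclicBarrier whose barrier action plays the role of the master task: every worker blocks until all updates have arrived, the master merges them, and the merged result is visible to all workers before the next superstep. This is a minimal illustration, not Horn's actual trainer code; `SyncSgdDemo`, `pending`, and `weight` are hypothetical names.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.DoubleAdder;

// Hypothetical sketch of the synchronous merge-and-broadcast loop.
public class SyncSgdDemo {
  static final int TASKS = 3;
  static final int ITERATIONS = 4;
  // Accumulates the per-task updates for one superstep.
  static final DoubleAdder pending = new DoubleAdder();
  static volatile double weight = 0.0;

  public static void main(String[] args) throws Exception {
    // The barrier action stands in for the master task: it runs once all
    // TASKS workers have arrived, merges (here: averages) the updates, and
    // the new weight is "broadcast" by being visible when workers resume.
    CyclicBarrier barrier = new CyclicBarrier(TASKS, () -> {
      weight += pending.sumThenReset() / TASKS;
    });
    Thread[] workers = new Thread[TASKS];
    for (int t = 0; t < TASKS; t++) {
      workers[t] = new Thread(() -> {
        try {
          for (int i = 0; i < ITERATIONS; i++) {
            pending.add(1.0);   // compute an update locally (stubbed as 1.0)
            barrier.await();    // send to master; wait for the merged broadcast
          }
        } catch (InterruptedException | BrokenBarrierException e) {
          throw new RuntimeException(e);
        }
      });
      workers[t].start();
    }
    for (Thread w : workers) w.join();
    System.out.println(weight); // each superstep averages to +1.0, so 4.0
  }
}
```

Note how the barrier forces every task to wait for the slowest peer each superstep; this lock-step behavior is exactly what the asynchronous design below removes.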
> By separating the master, we can support asynchronous parallel SGD. My idea is simply to
> use task0 (BSPTask) as a server daemon. For the scope of this ticket, a single master is
> enough.
> {code}
> task0     |          task1                          ....   taskN
>           |
>           |
>           |   compute updates locally
>           |
>  Receive  |<------ push updates to master task
>  Update1  |                     
>           +------> fetch updates
>           |
>           |
>           |
>  Receive  |<------------------------------------ ..
>  Update2  |
>           +------------------------------------> ..
>           |
>           |
> {code}
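The push/fetch protocol in the diagram can be sketched as a single server thread (task0) draining a queue of pushed updates while workers fetch and push without ever synchronizing with each other. Again a hypothetical illustration, not Horn's RPC interface; `ParamServerDemo`, `push`, and `fetch` are made-up names.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of a single-master asynchronous parameter server.
public class ParamServerDemo {
  // task0 owns the parameters and applies pushed updates in arrival order.
  static final BlockingQueue<Double> inbox = new LinkedBlockingQueue<>();
  static volatile double weight = 0.0;
  static final double POISON = Double.NaN; // shutdown sentinel for the demo

  static double fetch() { return weight; }        // read without blocking peers
  static void push(double update) { inbox.add(update); }

  public static void main(String[] args) throws Exception {
    Thread server = new Thread(() -> {            // the dedicated master (task0)
      try {
        while (true) {
          double u = inbox.take();
          if (Double.isNaN(u)) break;             // demo-only shutdown signal
          weight += u;                            // merge one update at a time
        }
      } catch (InterruptedException ignored) { }
    });
    server.start();

    int workers = 3, steps = 5;
    Thread[] ts = new Thread[workers];
    for (int t = 0; t < workers; t++) {
      ts[t] = new Thread(() -> {
        for (int i = 0; i < steps; i++) {
          fetch();     // fetch current parameters (value unused in this stub)
          push(0.1);   // push a locally computed update (stubbed)
          // no barrier: each worker proceeds at its own pace
        }
      });
      ts[t].start();
    }
    for (Thread w : ts) w.join();
    push(POISON);      // drain remaining updates, then stop the server
    server.join();
    System.out.println(weight); // ~1.5 (3 workers x 5 pushes x 0.1)
  }
}
```

Because updates are applied one at a time as they arrive, a fast worker may fetch parameters that do not yet include a slower worker's latest push; tolerating that staleness is the usual trade-off of asynchronous parallel SGD.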



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
