systemml-issues mailing list archives

From "LI Guobao (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SYSTEMML-2085) Single-node parameter server primitives
Date Sun, 13 May 2018 16:27:00 GMT

     [ https://issues.apache.org/jira/browse/SYSTEMML-2085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

LI Guobao updated SYSTEMML-2085:
--------------------------------
    Description: 
A single-node parameter server acts as a data-parallel parameter server. A multi-node,
model-parallel parameter server will be discussed if time permits.

A diagram of the parameter server architecture is shown below.

  was:
A single-node parameter server acts as a data-parallel parameter server. A multi-node,
model-parallel parameter server will be discussed if time permits.

Synchronization:

We also need to implement synchronization between the workers and the parameter server in
order to support more parameter update strategies. For example, the stale-synchronous strategy
needs a hyperparameter "staleness" to define the waiting interval. The idea is to maintain a
vector clock in the server that records the clock of every worker. Each time a worker finishes
an iteration, it sends a request to the server and waits for a signal; the server calculates
the staleness from the vector clock to decide when to give that signal. When the server
receives the gradients from a worker, it increments the vector clock entry for that worker.
We could then define BSP (bulk synchronous parallel) as "staleness==0", ASP (asynchronous
parallel) as "staleness==-1" and SSP (stale synchronous parallel) as "staleness==N".
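
The vector-clock idea above can be sketched roughly as follows. This is only an illustration
and not the SystemML implementation: the class name VectorClockServer, the methods
push_gradients and may_proceed, and the exact waiting rule (a worker may continue while its
clock leads the slowest worker by at most "staleness") are assumptions made for this example.

```python
from typing import List


class VectorClockServer:
    """Hypothetical server-side bookkeeping for BSP / ASP / SSP.

    staleness == 0  -> BSP (every worker waits for the slowest one)
    staleness == -1 -> ASP (no worker ever waits)
    staleness == N  -> SSP (a worker may run at most N clocks ahead)
    """

    def __init__(self, num_workers: int, staleness: int) -> None:
        if num_workers < 1:
            raise ValueError("num_workers must be at least 1")
        if staleness < -1:
            raise ValueError("staleness must be -1 (ASP) or >= 0")
        self.staleness = staleness
        # One clock entry per worker, all starting at zero.
        self.clock: List[int] = [0] * num_workers

    def push_gradients(self, worker_id: int) -> None:
        """Called when gradients arrive: advance that worker's clock."""
        self.clock[worker_id] += 1

    def may_proceed(self, worker_id: int) -> bool:
        """Decide whether the worker gets the signal to start its next iteration."""
        if self.staleness == -1:
            return True  # ASP: never block
        # Assumed rule: compare the worker's clock with the slowest worker.
        lead = self.clock[worker_id] - min(self.clock)
        return lead <= self.staleness
```

Under this assumed rule, with two workers and staleness 0, a worker that has pushed its
gradients is held back until the other worker has pushed as well, which matches the BSP
behaviour described above.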

A diagram of the parameter server architecture is shown below.


> Single-node parameter server primitives
> ---------------------------------------
>
>                 Key: SYSTEMML-2085
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-2085
>             Project: SystemML
>          Issue Type: Technical task
>            Reporter: Matthias Boehm
>            Assignee: LI Guobao
>            Priority: Major
>         Attachments: ps.png
>
>
> A single-node parameter server acts as a data-parallel parameter server. A multi-node,
> model-parallel parameter server will be discussed if time permits.
> A diagram of the parameter server architecture is shown below.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
