systemml-issues mailing list archives

From "LI Guobao (JIRA)" <j...@apache.org>
Subject [jira] [Closed] (SYSTEMML-2324) Synchronization
Date Tue, 05 Jun 2018 16:42:00 GMT

     [ https://issues.apache.org/jira/browse/SYSTEMML-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

LI Guobao closed SYSTEMML-2324.
-------------------------------
       Resolution: Duplicate
    Fix Version/s: SystemML 1.2

> Synchronization
> ---------------
>
>                 Key: SYSTEMML-2324
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-2324
>             Project: SystemML
>          Issue Type: Sub-task
>            Reporter: LI Guobao
>            Assignee: LI Guobao
>            Priority: Major
>             Fix For: SystemML 1.2
>
>
> We also need to implement synchronization between the workers and the parameter server
to support more parameter update strategies; e.g., the stale-synchronous strategy needs
a hyperparameter "staleness" that defines the waiting interval. The idea is to maintain a vector
clock on the server that records every worker's clock. Each time a worker finishes an
iteration, it sends a signal to the server to increment its clock, and then sends a second
request asking whether it should wait. When the server receives this request, it
decides whether the worker may continue according to the strategy in use.
BSP can thus be defined as "staleness==0" and SSP as "staleness==N". For ASP, the server
does not need to calculate the gap between the fastest worker and the slowest one.
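
The scheme described above can be sketched roughly as follows. This is only an illustrative Python sketch, not SystemML's actual implementation: the class name `SyncServer`, the methods `tick` and `must_wait`, and the use of `staleness=None` to stand for ASP are all assumptions made for the example. It also assumes "gap" means the number of iterations by which a worker is ahead of the slowest worker.

```python
from typing import Dict, Iterable, Optional


class SyncServer:
    """Hypothetical server-side vector clock (names are illustrative only)."""

    def __init__(self, worker_ids: Iterable[str], staleness: Optional[int]):
        # Assumed convention: 0 -> BSP, N > 0 -> SSP, None -> ASP (never wait).
        self.staleness = staleness
        # One clock entry per worker, all starting at iteration 0.
        self.clock: Dict[str, int] = {w: 0 for w in worker_ids}

    def tick(self, worker_id: str) -> int:
        """First request: the worker signals that it finished an iteration."""
        self.clock[worker_id] += 1
        return self.clock[worker_id]

    def must_wait(self, worker_id: str) -> bool:
        """Second request: the worker asks whether it has to wait."""
        if self.staleness is None:
            # ASP: no gap computation is needed at all.
            return False
        # Gap between this worker and the slowest worker, in iterations.
        gap = self.clock[worker_id] - min(self.clock.values())
        return gap > self.staleness


if __name__ == "__main__":
    server = SyncServer(["w1", "w2"], staleness=1)
    server.tick("w1")
    print(server.must_wait("w1"))  # w1 is one iteration ahead, within staleness
    server.tick("w1")
    print(server.must_wait("w1"))  # w1 is two iterations ahead, exceeds staleness
```

With `staleness=0` a worker is held back as soon as it is ahead of any other worker, which gives barrier-like BSP behaviour; a larger value lets fast workers run ahead by at most that many iterations.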



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)