systemml-issues mailing list archives

From "Janardhan (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SYSTEMML-2083) Language and runtime for parameter servers
Date Wed, 14 Feb 2018 07:05:01 GMT

    [ https://issues.apache.org/jira/browse/SYSTEMML-2083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363568#comment-16363568 ]

Janardhan edited comment on SYSTEMML-2083 at 2/14/18 7:04 AM:
--------------------------------------------------------------

A lightweight parameter server interface is [ps-lite|https://github.com/dmlc/ps-lite], which serves as a simple example.

In simple terms (this explanation takes about 7 minutes to read), let's say our task is to calculate weights, with the help of gradients.
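A bare-bones illustration of that update rule, in plain Java (hypothetical names; not SystemML or ps-lite code):
{code:java}
// Minimal sketch of one SGD step: w_new = w - lr * g.
// Only illustrates "calculate weights with the help of gradients".
public final class SgdStep {
    static double[] update(double[] w, double[] g, double lr) {
        double[] next = new double[w.length];
        for (int i = 0; i < w.length; i++)
            next[i] = w[i] - lr * g[i]; // step against the gradient
        return next;
    }
}
{code}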
 

1. What does a parameter server look like? It consists of workers, a server, and the data (a minimal push/pull interface is sketched after the figure).

!image-2018-02-14-12-18-48-932.png!
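Following ps-lite's convention, the contract between workers and the server boils down to two primitives, pull and push. A minimal interface sketch (hypothetical names, not an actual SystemML API):
{code:java}
// Hedged sketch of the server's contract as seen by a worker,
// following the ps-lite push/pull convention (names are hypothetical).
interface ParameterServer {
    double[] pull();               // fetch the current weights
    void push(double[] gradient);  // send locally computed gradients
}
{code}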

 

 

2. What does a worker do? It takes a small portion of the data, *calculates gradients* from it, and sends them to the server (see the worker-loop sketch after the figure).

!image-2018-02-14-12-21-00-932.png!
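A hedged sketch of that worker loop, assuming the ParameterServer interface above and a least-squares gradient for a single example (all names hypothetical):
{code:java}
// Hedged sketch of a worker: pull weights, compute a gradient on its
// slice of the data, push the gradient back to the server.
class Worker implements Runnable {
    private final ParameterServer ps; // interface sketched above
    private final double[][] xs;      // this worker's examples (features)
    private final double[] ys;        // corresponding targets

    Worker(ParameterServer ps, double[][] xs, double[] ys) {
        this.ps = ps; this.xs = xs; this.ys = ys;
    }

    @Override public void run() {
        for (int i = 0; i < xs.length; i++) {
            double[] w = ps.pull();                 // 1. fetch latest model
            double[] g = gradient(w, xs[i], ys[i]); // 2. compute local gradient
            ps.push(g);                             // 3. send it to the server
        }
    }

    // Least-squares gradient for one example: g = (w.x - y) * x.
    private static double[] gradient(double[] w, double[] x, double y) {
        double pred = 0;
        for (int j = 0; j < w.length; j++) pred += w[j] * x[j];
        double[] g = new double[w.length];
        for (int j = 0; j < w.length; j++) g[j] = (pred - y) * x[j];
        return g;
    }
}
{code}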

 

3. What does the server do? It receives the gradients from the workers and *calculates the weights* (see the server sketch after the figure).

!image-2018-02-14-12-31-37-563.png!
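A matching hedged sketch of a synchronous server, which averages one gradient per worker before each update (barrier/wait logic omitted for brevity; hypothetical code, not the SystemML runtime):
{code:java}
import java.util.ArrayList;
import java.util.List;

// Hedged sketch: collect one gradient from each of k workers, average
// them, apply a single SGD step, and let workers pull the new weights.
class SyncServer implements ParameterServer {
    private final double[] weights;
    private final double lr;
    private final int numWorkers;
    private final List<double[]> pending = new ArrayList<>();

    SyncServer(double[] init, double lr, int numWorkers) {
        this.weights = init.clone();
        this.lr = lr;
        this.numWorkers = numWorkers;
    }

    @Override public synchronized void push(double[] g) {
        pending.add(g);
        if (pending.size() == numWorkers) { // all workers reported
            for (int i = 0; i < weights.length; i++) {
                double sum = 0;
                for (double[] grad : pending) sum += grad[i];
                weights[i] -= lr * (sum / numWorkers); // averaged SGD step
            }
            pending.clear();
        }
    }

    @Override public synchronized double[] pull() {
        return weights.clone(); // copy so workers cannot mutate server state
    }
}
{code}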



> Language and runtime for parameter servers
> ------------------------------------------
>
>                 Key: SYSTEMML-2083
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-2083
>             Project: SystemML
>          Issue Type: Epic
>            Reporter: Matthias Boehm
>            Priority: Major
>              Labels: gsoc2018
>         Attachments: image-2018-02-14-12-18-48-932.png, image-2018-02-14-12-21-00-932.png, image-2018-02-14-12-31-37-563.png
>
>
> SystemML already provides a rich set of execution strategies, ranging from local operations
> to large-scale computation on MapReduce or Spark. In this context, we support both data-parallel
> computation (multi-threaded or distributed operations) and task-parallel computation (multi-threaded
> or distributed parfor loops). This epic aims to complement the existing execution strategies
> with language and runtime primitives for parameter servers, i.e., model-parallel execution.
> We use the terminology of model-parallel execution with distributed data and a distributed model
> to differentiate it from the existing data-parallel operations. Target applications are
> distributed deep learning and mini-batch algorithms in general. These new abstractions will
> help make SystemML a unified framework for small- and large-scale machine learning that
> supports all three major execution strategies.
>  
> A major challenge is the integration of stateful parameter servers and their common push/pull
> primitives into an otherwise functional (and thus stateless) language. We will approach this
> challenge via a new builtin function {{paramserv}}, which internally maintains state but at
> the same time fits into the runtime framework of stateless operations.
> Furthermore, we are interested in providing (1) different runtime backends (local and
> distributed), (2) different parameter server modes (synchronous, asynchronous, hogwild!, stale-synchronous),
> (3) different update frequencies (batch, multi-batch, epoch), and (4) different architectures
> for distributed data (1 parameter server, k workers) and distributed model (k1 parameter servers,
> k2 workers).
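To make the stateless-integration point concrete, here is a rough sketch of how a paramserv-style primitive could keep all mutable state internal while exposing a purely functional contract. This is an assumed design for illustration only; the actual signature of {{paramserv}} is yet to be defined in this epic:
{code:java}
import java.util.List;
import java.util.function.BiFunction;

// Hedged sketch: the caller passes a model in and gets a new model back;
// all mutable parameter-server state lives inside the call, so the
// primitive still composes with a stateless, functional runtime.
// (Sequential simulation; the modes, update frequencies, and distributed
// architectures from points (1)-(4) above are omitted.)
final class ParamServSketch {
    static double[] paramserv(double[] model,
                              List<double[][]> workerShards, // data per worker
                              BiFunction<double[], double[][], double[]> gradFn,
                              double lr, int epochs) {
        double[] w = model.clone(); // internal, private state
        for (int e = 0; e < epochs; e++)
            for (double[][] shard : workerShards) {
                double[] g = gradFn.apply(w, shard); // "worker" step
                for (int i = 0; i < w.length; i++)   // "server" step
                    w[i] -= lr * g[i];
            }
        return w; // fresh object: caller observes a pure function
    }
}
{code}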



