systemml-issues mailing list archives

From "LI Guobao (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (SYSTEMML-2420) Communication between ps and workers
Date Mon, 30 Jul 2018 07:21:00 GMT

     [ https://issues.apache.org/jira/browse/SYSTEMML-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

LI Guobao resolved SYSTEMML-2420.
---------------------------------
       Resolution: Fixed
    Fix Version/s: SystemML 1.2

> Communication between ps and workers
> ------------------------------------
>
>                 Key: SYSTEMML-2420
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-2420
>             Project: SystemML
>          Issue Type: Sub-task
>            Reporter: LI Guobao
>            Assignee: LI Guobao
>            Priority: Major
>             Fix For: SystemML 1.2
>
>         Attachments: systemml_rpc_2_seq_diagram.png, systemml_rpc_class_diagram.png, systemml_rpc_sequence_diagram.png
>
>
> This sub-task implements the parameter exchange between the parameter server (ps) and the
workers. We can build our own RPC framework on top of the Netty framework. The Netty-based
{{TransportClient}} and {{TransportServer}} provide the sending and receiving services for the
ps and the workers. Extending {{RpcHandler}} lets the server invoke the corresponding ps method
(i.e., push or pull) according to the type of the incoming RPC call object. The {{SparkPsProxy}},
which wraps a {{TransportClient}}, then lets the workers issue push/pull calls to the server.
In addition, the ps Netty server provides a file repository service from which the workers
can download the partitioned training data, so that each worker can rebuild its matrix object
from the transferred file instead of having Spark broadcast all the files, not all of which
are needed by each worker.
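
The push/pull dispatch described above can be illustrated with a minimal in-memory sketch. It is not the SystemML implementation: there is no Netty transport here, and the class names ({{PsRpcSketch}}, {{ParamServer}}, {{PsRpcHandler}}, {{PsProxy}}), the method codes, the byte layout and the additive update rule are all assumptions made for illustration. The handler is called directly where a real client would send the request over the network.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class PsRpcSketch {
    // Hypothetical RPC method codes; the issue does not specify a wire format.
    static final int PUSH = 1;
    static final int PULL = 2;

    /** Toy parameter server holding one dense model vector. */
    static class ParamServer {
        private final double[] model;

        ParamServer(int size) { model = new double[size]; }

        // Assumed update rule: add the worker's gradient to the model.
        synchronized void push(int workerId, double[] gradient) {
            for (int i = 0; i < model.length; i++) model[i] += gradient[i];
        }

        // Return a copy of the current model to the requesting worker.
        synchronized double[] pull(int workerId) {
            return Arrays.copyOf(model, model.length);
        }
    }

    /** Server side: decodes an RPC call and invokes the matching ps method. */
    static class PsRpcHandler {
        private final ParamServer ps;

        PsRpcHandler(ParamServer ps) { this.ps = ps; }

        ByteBuffer receive(ByteBuffer request) {
            int method = request.getInt();
            int workerId = request.getInt();
            switch (method) {
                case PUSH:
                    ps.push(workerId, readVector(request));
                    return ByteBuffer.allocate(0); // empty acknowledgement
                case PULL:
                    return writeVector(ps.pull(workerId));
                default:
                    throw new IllegalArgumentException("Unknown RPC method: " + method);
            }
        }
    }

    /** Worker side: wraps the transport and exposes push/pull to the worker. */
    static class PsProxy {
        private final PsRpcHandler transport; // stands in for a network client
        private final int workerId;

        PsProxy(PsRpcHandler transport, int workerId) {
            this.transport = transport;
            this.workerId = workerId;
        }

        void push(double[] gradient) {
            ByteBuffer payload = writeVector(gradient);
            ByteBuffer req = ByteBuffer.allocate(8 + payload.remaining());
            req.putInt(PUSH).putInt(workerId).put(payload);
            req.flip();
            transport.receive(req);
        }

        double[] pull() {
            ByteBuffer req = ByteBuffer.allocate(8);
            req.putInt(PULL).putInt(workerId);
            req.flip();
            return readVector(transport.receive(req));
        }
    }

    // Encode a vector as [length:int][values:double...], ready for reading.
    static ByteBuffer writeVector(double[] v) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 8 * v.length);
        buf.putInt(v.length);
        for (double d : v) buf.putDouble(d);
        buf.flip();
        return buf;
    }

    // Decode a vector written by writeVector from the buffer's current position.
    static double[] readVector(ByteBuffer buf) {
        double[] v = new double[buf.getInt()];
        for (int i = 0; i < v.length; i++) v[i] = buf.getDouble();
        return v;
    }

    public static void main(String[] args) {
        PsRpcHandler handler = new PsRpcHandler(new ParamServer(3));
        PsProxy w0 = new PsProxy(handler, 0);
        PsProxy w1 = new PsProxy(handler, 1);
        w0.push(new double[] {1, 2, 3});
        w1.push(new double[] {0.5, 0.5, 0.5});
        System.out.println(Arrays.toString(w0.pull())); // prints [1.5, 2.5, 3.5]
    }
}
```

Keeping the dispatch in one handler means a worker only ever sees the proxy's push/pull methods, so the in-memory call could in principle be swapped for a network client without touching worker code. The file repository service is not covered by this sketch.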



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
