reef-dev mailing list archives

From "DongJin Shin (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (REEF-1123) Develop a ssh-based standalone runtime
Date Mon, 18 Jan 2016 08:32:39 GMT

     [ https://issues.apache.org/jira/browse/REEF-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

DongJin Shin updated REEF-1123:
-------------------------------
    Description: 
REEF currently has a local runtime (reef-runtime-local) for dev/test. It has served its purpose
well. However, some users have expressed interest in doing dev/test in a distributed fashion.

So, it'd be good to develop an ssh-based standalone distributed runtime. Being ssh-based means
that there's less code for us to write and that it's easy for users to deploy. Being standalone
means that users don't need to set up a separate resource manager (e.g., YARN, Mesos) to run
REEF jobs on multiple nodes.

Design Plan
  - {{scp}} the required files to the remote nodes and launch them through ssh
  - Tasks are submitted over the ssh connection and run in the foreground, enabling the driver
to receive their return status
  - Evaluator logs are saved temporarily in $HOME/REEF_SSH_RUNTIME on each node and are
sent to the driver after the task has finished
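
The three steps in the design plan can be sketched as follows. This is a minimal illustration in Python, not REEF code: the function names, the host name, the file name evaluator.jar, and the log name evaluator.log are all hypothetical. It relies only on standard behavior of the scp and ssh command-line tools: ssh exits with the exit status of the remote command (or 255 if ssh itself fails), which is how a driver could observe the return status of a foreground task.

```python
import subprocess

# Log directory named in the design plan; the remote shell expands $HOME.
REMOTE_LOG_DIR = "$HOME/REEF_SSH_RUNTIME"


def scp_to_node(local_path, host, remote_path):
    """Build the scp command that copies a required file to a node."""
    return ["scp", local_path, f"{host}:{remote_path}"]


def launch_over_ssh(host, evaluator_cmd, log_name="evaluator.log"):
    """Build the ssh command that runs the evaluator in the foreground.

    Output is redirected into the per-node log directory. ssh exits with
    the exit status of the remote command (or 255 if ssh itself fails).
    """
    log_file = f"{REMOTE_LOG_DIR}/{log_name}"
    remote = f"mkdir -p {REMOTE_LOG_DIR} && {evaluator_cmd} > {log_file} 2>&1"
    return ["ssh", host, remote]


def fetch_log(host, local_dir, log_name="evaluator.log"):
    """Build the scp command that copies the node's log back to the driver."""
    # scp resolves relative remote paths against the remote home directory.
    return ["scp", f"{host}:REEF_SSH_RUNTIME/{log_name}", local_dir]


def run_on_node(host, local_path, evaluator_cmd, local_log_dir, run=subprocess.run):
    """Copy, launch, then collect logs; return the exit status seen by ssh."""
    copied = run(scp_to_node(local_path, host, "evaluator.jar"))
    if copied.returncode != 0:
        return copied.returncode
    launched = run(launch_over_ssh(host, evaluator_cmd))
    run(fetch_log(host, local_log_dir))  # collect logs even if the task failed
    return launched.returncode
```

A call such as run_on_node("node1", "app.jar", "java -jar evaluator.jar", "logs/") would block until the remote command exits, because ssh stays attached to a foreground process. The run parameter exists only so the sketch can be exercised without a real remote host.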

  was:
REEF currently has a local runtime (reef-runtime-local) for dev/test. It has served its purpose
well. However, some users have expressed interest in doing dev/test in a distributed fashion.

So, it'd be good to develop an ssh-based standalone distributed runtime. Being ssh-based means
that there's less code for us to write and that it's easy for users to deploy. Being standalone
means that users don't need to set up a separate resource manager (e.g., YARN, Mesos) to run
REEF jobs on multiple nodes.


> Develop a ssh-based standalone runtime
> --------------------------------------
>
>                 Key: REEF-1123
>                 URL: https://issues.apache.org/jira/browse/REEF-1123
>             Project: REEF
>          Issue Type: New Feature
>            Reporter: John Yang
>            Assignee: DongJin Shin
>
> REEF currently has a local runtime (reef-runtime-local) for dev/test. It has served its
> purpose well. However, some users have expressed interest in doing dev/test in a distributed
> fashion.
> So, it'd be good to develop an ssh-based standalone distributed runtime. Being ssh-based
> means that there's less code for us to write and that it's easy for users to deploy. Being
> standalone means that users don't need to set up a separate resource manager (e.g., YARN,
> Mesos) to run REEF jobs on multiple nodes.
> Design Plan
>   - {{scp}} the required files to the remote nodes and launch them through ssh
>   - Tasks are submitted over the ssh connection and run in the foreground, enabling the
> driver to receive their return status
>   - Evaluator logs are saved temporarily in $HOME/REEF_SSH_RUNTIME on each node and are
> sent to the driver after the task has finished



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
