reef-dev mailing list archives

From "Sergiy Matusevych (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (REEF-1791) Implement reef-runtime-spark
Date Wed, 13 Sep 2017 23:07:00 GMT

    [ https://issues.apache.org/jira/browse/REEF-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16165441#comment-16165441 ]

Sergiy Matusevych commented on REEF-1791:
-----------------------------------------

bq. 1) What are the negatives of running in unmanaged AM mode? What happens if the code runs
into any performance issues, how will it recover, and how will the user manage this? It seems
like this is placing more responsibility on the user, who may or may not have this knowledge.

By design, REEF assumes that the user takes full responsibility for the app. This is done because
we want the user to be in control as much as possible while providing sane defaults. Running
REEF from Spark is no different - we assume that the user will implement all the necessary
event handlers for the failure events if the defaults are not sufficient for the use case.
What is different in the Unmanaged AM mode is that the REEF Driver launched from Spark must
also respond to the (failure) events originating from Spark, and we currently do not have mechanisms
to forward Spark events to the REEF app _transparently_ - the user has to do it by hand (see
the sketch below). Other than Spark-to-REEF event forwarding, performance, error recovery, and
usability issues are not directly relevant to this PR - we can discuss them later elsewhere.
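
To make this concrete, here is a minimal sketch of what I mean by binding failure handlers explicitly
in the Driver configuration. It assumes the standard {{DriverConfiguration}} module; the class and
handler names ({{SparkReefDriver}} etc.) are hypothetical and not part of this PR:

{code:java}
package org.apache.reef.examples.spark; // hypothetical package

import javax.inject.Inject;
import org.apache.reef.client.DriverConfiguration;
import org.apache.reef.driver.evaluator.FailedEvaluator;
import org.apache.reef.driver.task.FailedTask;
import org.apache.reef.tang.Configuration;
import org.apache.reef.wake.EventHandler;
import org.apache.reef.wake.time.event.StartTime;

/** Hypothetical Driver for a REEF-on-Spark job with explicit failure handlers. */
public final class SparkReefDriver {

  @Inject
  private SparkReefDriver() { }

  /** Driver start: request Evaluators, set up the job, etc. */
  public final class StartHandler implements EventHandler<StartTime> {
    @Override
    public void onNext(final StartTime startTime) {
      // Job setup goes here.
    }
  }

  /** Evaluator failures: Spark-side failures would have to be forwarded here by hand. */
  public final class EvaluatorFailedHandler implements EventHandler<FailedEvaluator> {
    @Override
    public void onNext(final FailedEvaluator failedEvaluator) {
      // Application-specific recovery: re-request the Evaluator, or shut the job down.
    }
  }

  /** Task failures. */
  public final class TaskFailedHandler implements EventHandler<FailedTask> {
    @Override
    public void onNext(final FailedTask failedTask) {
      // Application-specific recovery logic.
    }
  }

  /** Driver configuration with the failure handlers bound explicitly. */
  public static Configuration getDriverConfiguration() {
    return DriverConfiguration.CONF
        .set(DriverConfiguration.DRIVER_IDENTIFIER, "SparkReefDriver")
        .set(DriverConfiguration.ON_DRIVER_STARTED, SparkReefDriver.StartHandler.class)
        .set(DriverConfiguration.ON_EVALUATOR_FAILED, SparkReefDriver.EvaluatorFailedHandler.class)
        .set(DriverConfiguration.ON_TASK_FAILED, SparkReefDriver.TaskFailedHandler.class)
        .build();
  }
}
{code}

Any Spark-side failure the user cares about would have to be translated into one of these REEF
events (or handled separately) by the application code.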

bq. 2) I would like to see an end-to-end User Interaction Diagram; maybe when we meet we
can discuss this.

Let's talk about it in the meeting and post a picture here.

bq. 3) Is it possible to make the partitions configurable in the DataLoader Service? In general,
I'd like to understand how this can be specified.

I am not sure which parameters you are talking about. Please take a look at the {{DataLoadingRequestBuilder}}
and let me know what other parameters we might need for Spark integration; a rough sketch of
the knob I have in mind is below.
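
Assuming the current builder API, {{setNumberOfDesiredSplits()}} already controls how many partitions
(and hence data-loading Evaluators) we get. This is just a sketch with placeholder values and a
hypothetical helper class name:

{code:java}
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.reef.client.DriverConfiguration;
import org.apache.reef.io.data.loading.api.DataLoadingRequestBuilder;
import org.apache.reef.tang.Configuration;

/** Hypothetical helper: build a data-loading configuration with a configurable split count. */
public final class SparkDataLoadingConfig {

  private SparkDataLoadingConfig() { }

  public static Configuration get(final String inputPath, final int numSplits) {
    return new DataLoadingRequestBuilder()
        .setNumberOfDesiredSplits(numSplits)        // the "partitions" knob
        .setMemoryMB(1024)                          // memory per data-loading Evaluator
        .setInputFormatClass(TextInputFormat.class)
        .setInputPath(inputPath)
        // application handlers (active context, task completion) would be bound on this module:
        .setDriverConfigurationModule(DriverConfiguration.CONF
            .set(DriverConfiguration.DRIVER_IDENTIFIER, "SparkDataLoadingDriver"))
        .build();
  }
}
{code}

If that does not cover what you need, let's enumerate the missing parameters.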

bq. 4) What are the tradeoffs between using the EvaluatorRequestor versus the DataLoader? If the
goal is to not have too much dependency on Spark internals, it seems like the DataLoader is
a better approach.

In my opinion, {{EvaluatorRequestor}} is more flexible, as it allows us to request additional
partitions (and potentially new datasets) at runtime. OTOH, {{DataLoader}} can be easier
to implement, and it should cover 99% of our needs. In the long run, we may end up with both
approaches implemented.
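
To illustrate the flexibility argument, this is roughly the {{EvaluatorRequestor}} pattern I have
in mind - a sketch only, with placeholder numbers and a hypothetical handler name:

{code:java}
import javax.inject.Inject;
import org.apache.reef.driver.evaluator.EvaluatorRequest;
import org.apache.reef.driver.evaluator.EvaluatorRequestor;
import org.apache.reef.wake.EventHandler;
import org.apache.reef.wake.time.event.StartTime;

/** Hypothetical start handler that asks the runtime for Evaluators (i.e. partitions). */
public final class PartitionRequestHandler implements EventHandler<StartTime> {

  private final EvaluatorRequestor requestor;

  @Inject
  private PartitionRequestHandler(final EvaluatorRequestor requestor) {
    this.requestor = requestor;
  }

  @Override
  public void onNext(final StartTime startTime) {
    // Request 4 Evaluators now; the same call can be repeated later,
    // from any Driver event handler, to grow the set of partitions at runtime.
    this.requestor.submit(EvaluatorRequest.newBuilder()
        .setNumber(4)
        .setMemory(2048)
        .setNumberOfCores(1)
        .build());
  }
}
{code}

{{DataLoader}}, by contrast, fixes its Evaluator requests up front when the configuration is built.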

bq. 5) I would postpone the low-level Spark API till the first part using the EvaluatorRequestor
or the DataLoader is complete.

That depends on how hard it is to implement the {{EvaluatorRequestor}} using the low-level Spark
API. If done properly, it can give us a REEF+Spark runtime that is completely transparent
to the end user; then we won't need any workarounds like {{DataLoader}} or a custom data-driven
{{SparkEvaluatorRequestor}}. Still, I would much prefer a workaround that would allow us to move
forward with Spark+REEF.NET integration now and come back to the low-level solution later.

bq. 6) In the REEF.NET Bridge, I would recommend launching a .NET VM as a separate process,
to avoid using JNI and losing the ability to use spark-submit.

I agree.
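
Something along these lines, just to illustrate the separate-process idea (not the actual bridge
code; the executable path and the {{--bridge-port}} argument are made up):

{code:java}
import java.io.IOException;

/** Hypothetical launcher: start the .NET Driver as a child process instead of going through JNI. */
public final class DotNetDriverLauncher {

  private DotNetDriverLauncher() { }

  public static Process launch(final String dotNetDriverExe, final int bridgePort) throws IOException {
    return new ProcessBuilder(dotNetDriverExe, "--bridge-port", Integer.toString(bridgePort))
        .inheritIO()   // forward stdout/stderr so spark-submit logs capture them
        .start();
  }
}
{code}

With the .NET Driver in its own process, the Java side remains a plain Spark application and
spark-submit keeps working as usual.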

> Implement reef-runtime-spark
> ----------------------------
>
>                 Key: REEF-1791
>                 URL: https://issues.apache.org/jira/browse/REEF-1791
>             Project: REEF
>          Issue Type: New Feature
>          Components: REEF
>            Reporter: Sergiy Matusevych
>            Assignee: Saikat Kanjilal
>         Attachments: file-1.jpeg, file.jpeg
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> We need to run REEF Tasks on Spark Executors. Ideally, that should require only a few
> lines of changes in the REEF application configuration. All Spark-related logic must be encapsulated
> in the {{reef-runtime-spark}} module, similar to the existing runtimes, e.g. {{reef-runtime-yarn}}
> or {{reef-runtime-local}}. As a first step, we can have a Java-only solution, but later we'll
> need to run .NET Tasks on Executors as well.
> h3. P.S. Here's a REEF Wiki page with more details: *[Spark+REEF integration|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=73636401]*



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
