spark-issues mailing list archives

From "Sandy Ryza (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-3561) Decouple Spark's API from its execution engine
Date Fri, 03 Oct 2014 22:58:35 GMT

     [ https://issues.apache.org/jira/browse/SPARK-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated SPARK-3561:
------------------------------
    Description: 
Currently Spark's user-facing API is tightly coupled with its backend execution engine.  
It could be useful to provide a point of pluggability between the two to allow Spark to run
on other DAG execution engines with similar distributed memory abstractions.

Proposal:
The proposed approach would introduce a pluggable JobExecutionContext (trait) as a non-public
API (@DeveloperApi) not exposed to end users of Spark.
The trait will define only 4 operations:
* hadoopFile
* newAPIHadoopFile
* broadcast
* runJob
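
As a rough illustration of the shape such a trait could take, here is a minimal Scala sketch. It is not the code from the design doc or the pull request. SketchRDD, SketchBroadcast and InMemoryExecutionContext are hypothetical stand-ins so the example compiles without Spark, and the parameter lists are deliberately simplified assumptions (the real SparkContext methods take more arguments).

```scala
// Stand-in types so the sketch compiles without Spark on the classpath.
// They are NOT Spark's real RDD or Broadcast classes.
final case class SketchRDD[T](partitions: Seq[Seq[T]])
final case class SketchBroadcast[T](value: T)

// The four operations named in the proposal, with simplified signatures.
trait JobExecutionContext {
  def hadoopFile(path: String): SketchRDD[(Long, String)]
  def newAPIHadoopFile(path: String): SketchRDD[(Long, String)]
  def broadcast[T](value: T): SketchBroadcast[T]
  def runJob[T, U](rdd: SketchRDD[T], func: Iterator[T] => U): Seq[U]
}

// Hypothetical toy engine: "files" live in a map and every job runs locally.
class InMemoryExecutionContext(files: Map[String, Seq[String]])
    extends JobExecutionContext {

  // Reads a "file" as one partition of (line number, line) pairs.
  private def read(path: String): SketchRDD[(Long, String)] = {
    val lines = files.getOrElse(path, Seq.empty[String])
    SketchRDD(Seq(lines.zipWithIndex.map { case (line, i) => (i.toLong, line) }))
  }

  def hadoopFile(path: String): SketchRDD[(Long, String)] = read(path)

  def newAPIHadoopFile(path: String): SketchRDD[(Long, String)] = read(path)

  def broadcast[T](value: T): SketchBroadcast[T] = SketchBroadcast(value)

  // Applies func once per partition and collects one result per partition.
  def runJob[T, U](rdd: SketchRDD[T], func: Iterator[T] => U): Seq[U] =
    rdd.partitions.map(partition => func(partition.iterator))
}
```

An alternative engine would supply its own implementation of the same four methods, leaving the user-facing API untouched.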

Each method maps directly to the corresponding method in the current version of SparkContext.
The JobExecutionContext implementation will be accessed by SparkContext via the master URL, as "execution-context:foo.bar.MyJobExecutionContext",
with the default implementation containing the existing code from SparkContext, thus allowing
the current (corresponding) methods of SparkContext to delegate to that implementation. An integrator
will now have the option to provide a custom implementation of JobExecutionContext by either
implementing it from scratch or extending DefaultExecutionContext.
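
The selection and delegation described above might look roughly like the following Scala sketch. It assumes the class name is whatever follows the "execution-context:" prefix and that the class has a no-argument constructor; neither detail is confirmed by this message. ExecutionContextResolver and SketchSparkContext are hypothetical names, and the trait is cut down to a single simplified method to keep the example short.

```scala
// Simplified stand-in for the proposed trait (one method only).
trait JobExecutionContext {
  def runJob[T, U](data: Seq[T], func: Iterator[T] => U): U
}

// Stand-in for the default implementation, which per the proposal would
// hold the code that currently lives in SparkContext.
class DefaultExecutionContext extends JobExecutionContext {
  def runJob[T, U](data: Seq[T], func: Iterator[T] => U): U =
    func(data.iterator)
}

// Hypothetical helper that picks an implementation from the master URL.
object ExecutionContextResolver {
  private val Prefix = "execution-context:"

  // Returns the class name carried by the master URL, if it uses the prefix.
  def classNameFrom(master: String): Option[String] =
    if (master.startsWith(Prefix) && master.length > Prefix.length)
      Some(master.substring(Prefix.length))
    else None

  // Instantiates the named class reflectively (assumes a no-arg constructor),
  // or falls back to the default implementation for any other master URL.
  def resolve(master: String): JobExecutionContext =
    classNameFrom(master) match {
      case Some(name) =>
        Class.forName(name)
          .getDeclaredConstructor()
          .newInstance()
          .asInstanceOf[JobExecutionContext]
      case None => new DefaultExecutionContext
    }
}

// Hypothetical stand-in for SparkContext: its public method keeps the same
// shape and simply delegates to whichever implementation was resolved.
class SketchSparkContext(master: String) {
  private val executionContext = ExecutionContextResolver.resolve(master)

  def runJob[T, U](data: Seq[T], func: Iterator[T] => U): U =
    executionContext.runJob(data, func)
}
```

With a master URL such as "local[2]" the sketch falls back to DefaultExecutionContext, so existing behaviour is unchanged unless an integrator opts in through the "execution-context:" prefix.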

Please see the attached design doc for more details.
A pull request will be posted shortly as well.

  was:
Currently Spark's API is tightly coupled with its backend execution engine.   It could be
useful to provide a point of pluggability between the two to allow Spark to run on other DAG
execution engines with similar distributed memory abstractions.

Proposal:
The proposed approach would introduce a pluggable JobExecutionContext (trait) as a non-public
API (@DeveloperApi) not exposed to end users of Spark.
The trait will define only 4 operations:
* hadoopFile
* newAPIHadoopFile
* broadcast
* runJob

Each method maps directly to the corresponding method in the current version of SparkContext.
The JobExecutionContext implementation will be accessed by SparkContext via the master URL, as "execution-context:foo.bar.MyJobExecutionContext",
with the default implementation containing the existing code from SparkContext, thus allowing
the current (corresponding) methods of SparkContext to delegate to that implementation. An integrator
will now have the option to provide a custom implementation of JobExecutionContext by either
implementing it from scratch or extending DefaultExecutionContext.

Please see the attached design doc for more details.
A pull request will be posted shortly as well.


> Decouple Spark's API from its execution engine
> ----------------------------------------------
>
>                 Key: SPARK-3561
>                 URL: https://issues.apache.org/jira/browse/SPARK-3561
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 1.1.0
>            Reporter: Oleg Zhurakousky
>              Labels: features
>             Fix For: 1.2.0
>
>         Attachments: SPARK-3561.pdf
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
