airflow-commits mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-1331) Contrib.SparkSubmitOperator should allow --packages parameter
Date Sun, 24 Sep 2017 11:19:02 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16178159#comment-16178159 ]

ASF subversion and git services commented on AIRFLOW-1331:
----------------------------------------------------------

Commit fbca8f0ad8a01364fd4ddb3b5b8b7f9e15660060 in incubator-airflow's branch refs/heads/v1-9-test
from [~hayashidac]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=fbca8f0 ]

[AIRFLOW-1331] add SparkSubmitOperator option

spark-submit has a --packages option for using
additional Java packages, but the current version
of SparkSubmitOperator could not handle it.
This adds a "packages" option to SparkSubmitOperator
to resolve that, and adds the same option to
TestSparkSubmitOperator.

Closes #2622 from chie8842/AIRFLOW-1331

(cherry picked from commit e4a984a6b87888753415bdd4308c89622c983917)
Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>
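
For context, a minimal usage sketch of the new option against the contrib operator. The DAG name, application path, and package coordinates are illustrative placeholders, not taken from the patch:

{code:python}
from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.spark_submit_operator import SparkSubmitOperator

# Illustrative DAG; name, start date and schedule are placeholders.
dag = DAG("spark_packages_example", start_date=datetime(2017, 9, 1), schedule_interval=None)

submit = SparkSubmitOperator(
    task_id="submit_with_packages",
    application="/path/to/app.py",
    conn_id="spark_default",
    # New in this change: comma-separated Maven coordinates forwarded to --packages.
    packages="com.databricks:spark-csv_2.11:1.5.0",
    dag=dag,
)
{code}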


> Contrib.SparkSubmitOperator should allow --packages parameter
> -------------------------------------------------------------
>
>                 Key: AIRFLOW-1331
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1331
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: contrib
>            Reporter: manuel garrido
>
> Right now SparkSubmitOperator (and its related hook, SparkSubmitHook) does not allow the
> packages parameter, an option that is very useful for pulling packages from the spark-packages
> repository.
> I am not an expert by any means, but given how SparkSubmitHook builds the command to
> submit a Spark job, this could be as easy as adding
> {code:python}
>         if self._packages:
>             connection_cmd += ["--packages", self._packages]
> {code}
> right under [this line](https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/spark_submit_hook.py#L167),
> as well as adding the *packages* parameter (defaulting to None) to both the SparkSubmitHook
> and SparkSubmitOperator init methods (basically, anywhere the jars parameter is handled); a
> sketch of the full set of changes is given after this message.
> To be honest, I would not mind doing a pull request to fix this; however, I am not knowledgeable
> enough about Airflow or about how the contribution guidelines are set up. If the community thinks
> this could be an easy fix that a newbie like me can do (I do believe it is), then please let
> me know and I will do my best.
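
Taken together, the suggestion amounts to roughly the change below. This is only a self-contained sketch of the command-building pattern; the class and method names are illustrative and do not mirror the exact code that was merged:

{code:python}
# Illustrative sketch, not the actual SparkSubmitHook source.
class SparkSubmitCommandSketch(object):
    def __init__(self, application, jars=None, packages=None):
        self._application = application
        self._jars = jars
        # New parameter, defaulting to None, as suggested in the issue.
        self._packages = packages

    def build_command(self):
        connection_cmd = ["spark-submit"]
        if self._jars:
            connection_cmd += ["--jars", self._jars]
        # Analogous handling for --packages: comma-separated Maven coordinates.
        if self._packages:
            connection_cmd += ["--packages", self._packages]
        connection_cmd += [self._application]
        return connection_cmd


if __name__ == "__main__":
    cmd = SparkSubmitCommandSketch(
        application="/path/to/app.py",
        packages="com.databricks:spark-csv_2.11:1.5.0",
    ).build_command()
    print(" ".join(cmd))
    # spark-submit --packages com.databricks:spark-csv_2.11:1.5.0 /path/to/app.py
{code}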



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
