airflow-commits mailing list archives

From "ASF subversion and git services (Jira)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-5744) Environment variables not correctly set in Spark submit operator
Date Wed, 18 Dec 2019 03:17:00 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-5744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16998772#comment-16998772 ]

ASF subversion and git services commented on AIRFLOW-5744:
----------------------------------------------------------

Commit dd90fb2a211abbd4bd2eefe27ffb0decbefeb4d8 in airflow's branch refs/heads/v1-10-test
from Joseph McCartin
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=dd90fb2 ]

[AIRFLOW-5744] Environment variables not correctly set in Spark submit operator (#6796)


(cherry picked from commit 699aea8ee368abcba29d717daf2580f897ab9d93)


> Environment variables not correctly set in Spark submit operator
> ----------------------------------------------------------------
>
>                 Key: AIRFLOW-5744
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5744
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: contrib, operators
>    Affects Versions: 1.10.5
>            Reporter: Joseph McCartin
>            Assignee: Joseph McCartin
>            Priority: Trivial
>             Fix For: 1.10.7
>
>
> AIRFLOW-2380 added support for setting environment variables at runtime for the SparkSubmitOperator.
> The intention was to allow for dynamic configuration paths (such as HADOOP_CONF_DIR). The
> pull request, however, set these env vars at runtime only if a standalone cluster and
> client deploy mode were chosen. For kubernetes and yarn modes, the env vars are instead
> sent to the driver via the spark arguments _spark.yarn.appMasterEnv_ (and the
> equivalent for k8s).
> If one wishes to dynamically set the yarn master address (via a _yarn-site.xml_ file),
> then one or more environment variables need to be present at runtime, and this is not
> currently done.
> The SparkSubmitHook class variable `_env` is assigned the `_env_vars` value from the
> SparkSubmitOperator in the `_build_spark_submit_command` method. When running in YARN mode,
> however, this assignment does not happen as it should, and therefore `_env` is not passed
> to the Popen process.
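
The behaviour described above can be illustrated with a minimal, standalone sketch. It does
not import Airflow and is not the code from the referenced commit; the helper names
(`build_popen_env`, `driver_env_conf`, `run_with_env`) and the example path are hypothetical.
It assumes the goal is that the local spark-submit process always receives the env vars,
while yarn and k8s masters additionally receive them as driver-side `--conf` arguments.

```python
import os
import subprocess
import sys


def build_popen_env(env_vars, base_env=None):
    """Merge user-supplied env vars over a copy of the parent environment."""
    merged = dict(os.environ if base_env is None else base_env)
    merged.update(env_vars or {})
    return merged


def driver_env_conf(env_vars, master):
    """Build --conf arguments that forward env vars to the driver.

    Uses Spark's spark.yarn.appMasterEnv.* and spark.kubernetes.driverEnv.*
    properties; other masters get no extra arguments.
    """
    if master.startswith("yarn"):
        prefix = "spark.yarn.appMasterEnv."
    elif master.startswith("k8s"):
        prefix = "spark.kubernetes.driverEnv."
    else:
        return []
    args = []
    for key, value in (env_vars or {}).items():
        args += ["--conf", "{}{}={}".format(prefix, key, value)]
    return args


def run_with_env(cmd, env_vars):
    """Run cmd with env_vars set in the child process, whatever the master is."""
    proc = subprocess.Popen(
        cmd,
        env=build_popen_env(env_vars),  # the step the issue reports as skipped on YARN
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    output, _ = proc.communicate()
    return proc.returncode, output


if __name__ == "__main__":
    # Stand-in for spark-submit: a child process that reads the variable.
    child = [sys.executable, "-c", "import os; print(os.environ['HADOOP_CONF_DIR'])"]
    print(run_with_env(child, {"HADOOP_CONF_DIR": "/tmp/hadoop-conf"}))
```

Copying the parent environment before applying the overrides keeps PATH and similar
variables available to the child process, which passing only the user-supplied dict as
`env` would not.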



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
