spark-issues mailing list archives

From "Alon Shoham (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-24456) Spark submit - server environment variables are overwritten by client environment variables
Date Mon, 04 Jun 2018 05:55:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-24456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alon Shoham updated SPARK-24456:
--------------------------------
    Description: 
When submitting a Spark application with --deploy-mode cluster to a Spark standalone cluster, environment
variables from the client machine overwrite the server's environment variables.

 

We use the *SPARK_DIST_CLASSPATH* environment variable to add extra required dependencies to the
application. We observed that the client machine's SPARK_DIST_CLASSPATH overwrites the remote server
machine's value, resulting in application submission failure.

 

We have inspected the code and found:

1. In org.apache.spark.deploy.Client line 86:
{code:scala}
val command = new Command(mainClass,
  Seq("{{WORKER_URL}}", "{{USER_JAR}}", driverArgs.mainClass) ++ driverArgs.driverOptions,
  sys.env, classPathEntries, libraryPathEntries, javaOpts)
{code}
2. In org.apache.spark.launcher.WorkerCommandBuilder line 35:
{code:scala}
childEnv.putAll(command.environment.asJava)
childEnv.put(CommandBuilderUtils.ENV_SPARK_HOME, sparkHome)
{code}
Line 35 shows that the server machine's environment is overwritten with the client's values, while
line 36 restores SPARK_HOME to the server value.
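To illustrate the mechanism with a standalone sketch (hypothetical paths, not Spark code): java.util.Map.putAll replaces every key that is already present in the target map, so the client's SPARK_DIST_CLASSPATH silently clobbers the server's value unless it is explicitly restored afterwards:
{code:scala}
import scala.collection.JavaConverters._

object EnvOverwriteSketch extends App {
  // Server-side child environment before the merge (hypothetical value)
  val childEnv = new java.util.HashMap[String, String]()
  childEnv.put("SPARK_DIST_CLASSPATH", "/opt/spark/server-jars/*")

  // command.environment shipped from the client (hypothetical value)
  val commandEnvironment = Map("SPARK_DIST_CLASSPATH" -> "/home/client/jars/*")

  // Mirrors WorkerCommandBuilder line 35: every client key replaces the server's
  childEnv.putAll(commandEnvironment.asJava)

  // Prints the client path: the server value is gone
  println(childEnv.get("SPARK_DIST_CLASSPATH"))
}
{code}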

We think the bug can be fixed by adding a line that restores SPARK_DIST_CLASSPATH to its server
value, similar to what is already done for SPARK_HOME.
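Concretely, a sketch of the kind of change we have in mind in WorkerCommandBuilder (the sys.env lookup is our assumption about how to reach the server-side value; not verified against an actual patch):
{code:scala}
childEnv.putAll(command.environment.asJava)
childEnv.put(CommandBuilderUtils.ENV_SPARK_HOME, sparkHome)
// Hypothetical addition: restore the server-side value, as is done for SPARK_HOME.
// Assumes sys.env on the worker still exposes the server's SPARK_DIST_CLASSPATH here.
sys.env.get("SPARK_DIST_CLASSPATH").foreach { serverValue =>
  childEnv.put("SPARK_DIST_CLASSPATH", serverValue)
}
{code}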

 


> Spark submit - server environment variables are overwritten by client environment variables
> --------------------------------------------------------------------------------------------
>
>                 Key: SPARK-24456
>                 URL: https://issues.apache.org/jira/browse/SPARK-24456
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Submit
>    Affects Versions: 2.3.0
>            Reporter: Alon Shoham
>            Priority: Minor
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

