spark-issues mailing list archives

From "zhoutai.zt (JIRA)" <>
Subject [jira] [Created] (SPARK-23843) Deploy yarn meets incorrect LOCALIZED_CONF_DIR
Date Mon, 02 Apr 2018 07:53:00 GMT
zhoutai.zt created SPARK-23843:

             Summary: Deploy yarn meets incorrect LOCALIZED_CONF_DIR
                 Key: SPARK-23843
             Project: Spark
          Issue Type: Bug
          Components: Deploy
    Affects Versions: 2.3.0
         Environment: spark-2.3.0-bin-hadoop2.7
            Reporter: zhoutai.zt

We have implemented a new Hadoop-compatible filesystem and run Spark on it. The command is:
{quote}./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --executor-memory 1G --num-executors 1 /home/hadoop/app/spark-2.3.0-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.0.jar{quote}
The result is:
{quote}Exception in thread "main" org.apache.spark.SparkException: Application application_1522399820301_0020 finished with failed status{quote}
We set the log level to DEBUG and found:
{quote}2018-04-02 09:36:09,603 DEBUG org.apache.spark.deploy.yarn.Client: __app__.jar -> resource { scheme: "dfs" host: "" port: 10290 file: "/user/hadoop/.sparkStaging/application_1522399820301_0006/spark-examples_2.11-2.3.0.jar" } size: 1997548 timestamp: 1522632978000 type: FILE visibility: PRIVATE
2018-04-02 09:36:09,603 DEBUG org.apache.spark.deploy.yarn.Client: __spark_libs__ -> resource { scheme: "dfs" host: "" port: 10290 file: "/user/hadoop/.sparkStaging/application_1522399820301_0006/" } size: 242801307 timestamp: 1522632977000 type: ARCHIVE visibility: PRIVATE
2018-04-02 09:36:09,603 DEBUG org.apache.spark.deploy.yarn.Client: __spark_conf__ -> resource { port: -1 file: "/user/hadoop/.sparkStaging/application_1522399820301_0006/" } size: 185531 timestamp: 1522632978000 type: ARCHIVE visibility: PRIVATE{quote}
As shown, the resource information for __app__.jar and __spark_libs__ is complete, but the __spark_conf__ resource has no scheme or port.
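To see why the missing fields matter, here is a minimal, hypothetical sketch (plain Java, NOT YARN's actual code) of how a serialized resource URL's fields combine back into a URI string: with all fields present the result is parseable, but with scheme absent and port -1 it degenerates to ":///...".

```java
public class ResourceUrlSketch {
    // Hypothetical reconstruction (not YARN's actual code) of joining the
    // serialized LocalResource URL fields back into a URI string.
    static String rebuild(String scheme, String host, int port, String file) {
        String authority = (host == null ? "" : host) + (port != -1 ? ":" + port : "");
        return (scheme == null ? "" : scheme) + "://" + authority + file;
    }

    public static void main(String[] args) {
        // __spark_libs__: scheme "dfs", port 10290 -> a usable URI string
        System.out.println(rebuild("dfs", "", 10290,
                "/user/hadoop/.sparkStaging/application_1522399820301_0006/"));
        // __spark_conf__: no scheme, no host, port -1 -> ":///user/..." (malformed)
        System.out.println(rebuild(null, null, -1,
                "/user/hadoop/.sparkStaging/application_1522399820301_0006/"));
    }
}
```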

We explored the source code; addResource is called in two places in Client.scala:
{code:scala}
// First call site, in distribute():
val destPath = copyFileToRemote(destDir, localPath, replication, symlinkCache)
val destFs = FileSystem.get(destPath.toUri(), hadoopConf)
distCacheMgr.addResource(
  destFs, hadoopConf, destPath, localResources, resType, linkname, statCache,
  appMasterOnly = appMasterOnly)
{code}
{code:scala}
// Second call site, for the config archive:
val remoteConfArchivePath = new Path(destDir, LOCALIZED_CONF_ARCHIVE)
val remoteFs = FileSystem.get(remoteConfArchivePath.toUri(), hadoopConf)
sparkConf.set(CACHED_CONF_ARCHIVE, remoteConfArchivePath.toString())

val localConfArchive = new Path(createConfArchive().toURI())
copyFileToRemote(destDir, localConfArchive, replication, symlinkCache,
  force = true, destName = Some(LOCALIZED_CONF_ARCHIVE))

// Manually add the config archive to the cache manager so that the AM is launched with
// the proper files set up.
distCacheMgr.addResource(
  remoteFs, hadoopConf, remoteConfArchivePath, localResources, LocalResourceType.ARCHIVE,
  LOCALIZED_CONF_DIR, statCache, appMasterOnly = false)
{code}
As the source shows, the destination paths are constructed differently at the two call sites. This is confirmed by debug logging we added:
{quote}2018-04-02 15:18:46,357 ERROR org.apache.hadoop.yarn.util.ConverterUtils: getYarnUrlFromURI
2018-04-02 15:18:46,357 ERROR org.apache.hadoop.yarn.util.ConverterUtils: getYarnUrlFromURI
URL:null; null;-1;null;/user/root/.sparkStaging/application_1522399820301_0020/{quote}
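The divergence between the two call sites can be sketched in isolation: in the first, copyFileToRemote hands back a path qualified by the destination filesystem (scheme and authority filled in), while in the second the archive path is built directly from destDir and never re-qualified. A rough JVM analogy using java.net.URI, where `qualify` loosely emulates Hadoop's makeQualified behavior (names and the emulation itself are illustrative, not Spark's code):

```java
import java.net.URI;

public class QualifySketch {
    // Loose emulation of FileSystem#makeQualified: fill in scheme/authority
    // from the filesystem's default URI when the path lacks them.
    static URI qualify(URI defaultFs, URI path) {
        return path.getScheme() != null ? path : defaultFs.resolve(path.getPath());
    }

    public static void main(String[] args) throws Exception {
        URI fsUri = new URI("dfs://host:10290/");
        URI destDir = new URI("/user/hadoop/.sparkStaging/application_1522399820301_0006/");

        // First call site: the uploaded file's path comes back fully qualified.
        System.out.println(qualify(fsUri, destDir.resolve("spark-examples_2.11-2.3.0.jar")));

        // Second call site: the conf-archive path inherits destDir's (missing) scheme.
        System.out.println(destDir.resolve("__spark_conf__.zip"));
    }
}
```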
Log messages on YARN NM:
{quote}2018-04-02 09:36:11,958 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Failed to parse resource-request
Expected scheme name at index 0: :///user/hadoop/.sparkStaging/application_1522399820301_0006/{quote}
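The NM warning is consistent with plain java.net.URI parsing: a resource URL whose scheme field is empty serializes to a string beginning with ":///", which no URI parser accepts. A minimal reproduction in standard Java, independent of YARN:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class ParseFailureDemo {
    public static void main(String[] args) {
        // A resource URL whose scheme field was left empty serializes to ":///...".
        String malformed = ":///user/hadoop/.sparkStaging/application_1522399820301_0006/";
        try {
            new URI(malformed);
            System.out.println("parsed OK (unexpected)");
        } catch (URISyntaxException e) {
            // Same message as in the NodeManager log above.
            System.out.println(e.getMessage());
        }
    }
}
```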

This message was sent by Atlassian JIRA

