hadoop-yarn-issues mailing list archives

From "Billie Rinaldi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-9190) [Submarine] Submarine job will fail to run as a first job on a new created Hadoop 3.2.0 RC1 cluster
Date Fri, 11 Jan 2019 15:47:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-9190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740499#comment-16740499 ]

Billie Rinaldi commented on YARN-9190:

[~tangzhankun] The problem is that the submarine job launch doesn't know where the service
AM dependency jars are located. The dependencies are specified using a service.libdir system
property: https://github.com/apache/hadoop/blob/release-3.2.0-RC1/hadoop-yarn-project/hadoop-yarn/bin/yarn#L78-L85.
When running the yarn app -launch command or launching an app through the REST API, the system
property is already configured, but since submarine is using the Java API the system property
is not automatically set. An alternative to setting the system property is to run yarn app
-enableFastLaunch which will upload the dependency tarball. The tarball will also be uploaded
if you launch a job as the yarn user, as you discovered.

To avoid having to upload the dependency tarball in advance, we should either set the
service.libdir property for the submarine job launch, or possibly consider setting it for
all yarn jar commands.
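
As a rough illustration of the first option, a Java client could set the service.libdir
system property itself before submitting the job. The sketch below is not taken from the
submarine or YARN code base. The class name, the helper methods, the directory layout under
the Hadoop home, and the comma-separated format of the value are all assumptions, the last
two based on how the linked yarn script appears to build the property.

```java
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Hadoop or submarine.
final class ServiceLibDir {
    // Property name as given in the comment above.
    static final String SERVICE_LIBDIR = "service.libdir";

    // Assumed layout of a Hadoop binary distribution; adjust for your install.
    static final List<String> SUBDIRS = Arrays.asList(
        "share/hadoop/yarn", "share/hadoop/yarn/lib",
        "share/hadoop/hdfs", "share/hadoop/hdfs/lib",
        "share/hadoop/common", "share/hadoop/common/lib");

    // Builds a comma-separated list of jar directories under hadoopHome.
    static String buildLibDir(String hadoopHome) {
        return SUBDIRS.stream()
            .map(sub -> Paths.get(hadoopHome, sub).toString())
            .collect(Collectors.joining(","));
    }

    // Sets service.libdir only when it is not already set, so a value passed
    // on the command line wins. Returns the effective value.
    static String ensureLibDir(String hadoopHome) {
        String current = System.getProperty(SERVICE_LIBDIR);
        if (current != null && !current.isEmpty()) {
            return current;
        }
        String value = buildLibDir(hadoopHome);
        System.setProperty(SERVICE_LIBDIR, value);
        return value;
    }

    public static void main(String[] args) {
        // "/opt/hadoop" is a placeholder default, not a real requirement.
        String home = System.getenv().getOrDefault("HADOOP_HOME", "/opt/hadoop");
        System.out.println(SERVICE_LIBDIR + "=" + ensureLibDir(home));
    }
}
```

Calling ensureLibDir before the job is submitted would have the same effect as passing
-Dservice.libdir=... on the JVM command line, without requiring callers to know the flag.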

> [Submarine] Submarine job will fail to run as a first job on a new created Hadoop 3.2.0 RC1 cluster
> ---------------------------------------------------------------------------------------------------
>                 Key: YARN-9190
>                 URL: https://issues.apache.org/jira/browse/YARN-9190
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Zhankun Tang
>            Assignee: Sunil Govindan
>            Priority: Minor
> This issue was found while verifying submarine as part of Hadoop 3.2.0 RC1 planning. The steps to reproduce are:
>  # Init a new HDFS and YARN (LinuxContainerExecutor and Docker enabled)
>  # Before running any other yarn service job, submit a submarine job as the yarn user
> The job fails with the following error:
> {code:java}
> LogType:serviceam-err.txt
> LogLastModifiedTime:Thu Jan 10 21:15:23 +0800 2019
> LogLength:86
> LogContents:
> Error: Could not find or load main class org.apache.hadoop.yarn.service.ServiceMaster
> End of LogType:serviceam-err.txt
> {code}
> This seems to be because the dependencies are not ready, as the service client reported:
> {code:java}
> 2019-01-10 21:50:47,380 WARN client.ServiceClient: Property yarn.service.framework.path has a value /yarn-services/3.2.0/service-dep.tar.gz, but is not a valid file
> 2019-01-10 21:50:47,381 INFO client.ServiceClient: Uploading all dependency jars to HDFS. For faster submission of apps, set config property yarn.service.framework.path to the dependency tarball location. Dependency tarball can be uploaded to any HDFS path directly or by using command: yarn app -enableFastLaunch [<Destination Folder>]
> {code}
> When this error happens, there is no “/yarn-services” directory in HDFS.
> But after I run “yarn app -launch my-sleeper sleeper”, the “/yarn-services” directory is created in HDFS and the submarine job then runs successfully.
> {code:java}
> yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ hdfs dfs -ls
> -rwxr-xr-x 1 yarn supergroup 93596476 2019-01-11 08:23 /yarn-services/3.2.0/service-dep.tar.gz{code}
> This seems to be an issue with yarn service in 3.2.0 RC1, so I filed this Jira to track it.
> I also verified that the trunk branch doesn't have this issue.

This message was sent by Atlassian JIRA

