hadoop-mapreduce-user mailing list archives

From Rohith Sharma K S <rohithsharm...@huawei.com>
Subject RE: spark job hangs/fails with localizer failing
Date Wed, 13 May 2015 04:13:57 GMT
Hi,

This looks like a permission issue in secure mode:
Path /tmp/hadoop-yarn/nm-local-dir/usercache/testuser/appcache/application_1431466075462_0001 does not have desired permission.

Could you confirm that the directory permissions match the secure-mode configuration described here?
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SecureMode.html#Configuration
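For reference, a quick way to compare the on-disk layout against what the localizer expects is to stat each level of the local-dir tree. The sketch below rebuilds the usercache layout from the error in a scratch directory so it can be run anywhere; the modes shown (755/700) are illustrative only (the SecureMode page has the exact bits), and on the failing node you would point NM_LOCAL_DIR at your actual yarn.nodemanager.local-dirs value instead:

```shell
#!/bin/sh
# Demo only: rebuild the usercache layout from the error message in a
# scratch dir. On the failing NodeManager you would instead set
# NM_LOCAL_DIR=/tmp/hadoop-yarn/nm-local-dir (the value of
# yarn.nodemanager.local-dirs).
NM_LOCAL_DIR=$(mktemp -d)
mkdir -p "$NM_LOCAL_DIR/usercache/testuser/appcache"

# Illustrative modes: the per-user tree must be owned by the submitting
# user and traversable by it.
chmod 755 "$NM_LOCAL_DIR/usercache" "$NM_LOCAL_DIR/usercache/testuser"
chmod 700 "$NM_LOCAL_DIR/usercache/testuser/appcache"

# Print mode, owner:group and path for each level of the tree; on the
# real node, check that usercache/testuser is owned by testuser.
stat -c '%a %U:%G %n' \
    "$NM_LOCAL_DIR" \
    "$NM_LOCAL_DIR/usercache" \
    "$NM_LOCAL_DIR/usercache/testuser" \
    "$NM_LOCAL_DIR/usercache/testuser/appcache"

# The container-executor binary itself must be setuid root and
# group-restricted; on a real node with HADOOP_HOME set:
[ -x "${HADOOP_HOME:-/nonexistent}/bin/container-executor" ] && \
    stat -c '%a %U:%G %n' "$HADOOP_HOME/bin/container-executor"

# A usercache entry created under the wrong owner can simply be deleted
# on the real node; the NodeManager recreates it on the next
# localization:
#   rm -rf /tmp/hadoop-yarn/nm-local-dir/usercache/testuser
```

If usercache/testuser turns out to be owned by yarn rather than testuser (a common leftover when switching from the DefaultContainerExecutor to the LinuxContainerExecutor), deleting that directory and resubmitting the job is usually enough.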

Thanks & Regards
Rohith Sharma K S
From: Nance, Keith [mailto:knance@smartronix.com]
Sent: 13 May 2015 05:07
To: user@hadoop.apache.org
Subject: spark job hangs/fails with localizer failing

At wit's end... unable to get a simple Spark Pi application to run on a secured YARN cluster.  Help is MUCH appreciated.
Below are the log entries for the Spark Job, Node Manager, and Resource Manager.

###: SPARK USER/JOB :###
[testuser@ip-10-10-127-10 spark]$ ./bin/spark-submit --verbose --class org.apache.spark.examples.SparkPi --master yarn-client --num-executors 1 --executor-cores 1 lib/spark-examples*.jar 10
/home/testuser/spark/conf/spark-env.sh: line 54: -Dspark.history.kerberos.principal=spark/ip-10-10-127-10.ec2.internal@MALARD.LOCAL: No such file or directory
Using properties file: /home/testuser/spark/conf/spark-defaults.conf
Adding default property: spark.serializer=org.apache.spark.serializer.KryoSerializer
Adding default property: spark.executor.extraJavaOptions=-XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
Adding default property: spark.yarn.access.namenodes=hdfs://10.10.10.10:8020
Adding default property: spark.logConf=true
Adding default property: spark.eventlog.dir=hdfs://10.10.10.10:8020/user/testuser/spark/eventlog
Adding default property: spark.master=yarn-client
Adding default property: spark.authenticate=true
Adding default property: spark.eventlog.enabled=true
Parsed arguments:
  master                  yarn-client
  deployMode              null
  executorMemory          null
  executorCores           1
  totalExecutorCores      null
  propertiesFile          /home/testuser/spark/conf/spark-defaults.conf
  driverMemory            null
  driverCores             null
  driverExtraClassPath    null
  driverExtraLibraryPath  null
  driverExtraJavaOptions  null
  supervise               false
  queue                   null
  numExecutors            1
  files                   null
  pyFiles                 null
  archives                null
  mainClass               org.apache.spark.examples.SparkPi
  primaryResource         file:/home/testuser/spark/lib/spark-examples-1.3.1-hadoop2.6.0.jar
  name                    org.apache.spark.examples.SparkPi
  childArgs               [10]
  jars                    null
  packages                null
  repositories            null
  verbose                 true

Spark properties used, including those specified through
--conf and those from the properties file /home/testuser/spark/conf/spark-defaults.conf:
  spark.yarn.access.namenodes -> hdfs://10.10.10.10:8020
  spark.logConf -> true
  spark.eventlog.dir -> hdfs://10.10.10.10:8020/user/testuser/spark/eventlog
  spark.authenticate -> true
  spark.serializer -> org.apache.spark.serializer.KryoSerializer
  spark.executor.extraJavaOptions -> -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
  spark.master -> yarn-client
  spark.eventlog.enabled -> true


Main class:
org.apache.spark.examples.SparkPi
Arguments:
10
System properties:
spark.yarn.access.namenodes -> hdfs://10.10.10.10:8020
spark.executor.instances -> 1
spark.logConf -> true
spark.eventlog.dir -> hdfs://10.10.10.10:8020/user/testuser/spark/eventlog
spark.authenticate -> true
SPARK_SUBMIT -> true
spark.serializer -> org.apache.spark.serializer.KryoSerializer
spark.executor.extraJavaOptions -> -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
spark.app.name -> org.apache.spark.examples.SparkPi
spark.jars -> file:/home/testuser/spark/lib/spark-examples-1.3.1-hadoop2.6.0.jar
spark.master -> yarn-client
spark.executor.cores -> 1
spark.eventlog.enabled -> true
Classpath elements:
file:/home/testuser/spark/lib/spark-examples-1.3.1-hadoop2.6.0.jar


15/05/12 21:29:03 INFO spark.SparkContext: Running Spark version 1.3.1
15/05/12 21:29:03 INFO spark.SparkContext: Spark configuration:
spark.app.name=Spark Pi
spark.authenticate=true
spark.eventlog.dir=hdfs://10.10.10.10:8020/user/testuser/spark/eventlog
spark.eventlog.enabled=true
spark.executor.cores=1
spark.executor.extraJavaOptions=-XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
spark.executor.instances=1
spark.jars=file:/home/testuser/spark/lib/spark-examples-1.3.1-hadoop2.6.0.jar
spark.logConf=true
spark.master=yarn-client
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.yarn.access.namenodes=hdfs://10.10.10.10:8020
15/05/12 21:29:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/05/12 21:29:04 INFO spark.SecurityManager: Changing view acls to: testuser
15/05/12 21:29:04 INFO spark.SecurityManager: Changing modify acls to: testuser
15/05/12 21:29:05 INFO spark.SecurityManager: adding secret to credentials in yarn mode
15/05/12 21:29:05 INFO spark.SecurityManager: SecurityManager: authentication enabled; ui acls disabled; users with view permissions: Set(testuser); users with modify permissions: Set(testuser)
15/05/12 21:29:07 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/05/12 21:29:07 INFO Remoting: Starting remoting
15/05/12 21:29:07 INFO util.Utils: Successfully started service 'sparkDriver' on port 53747.
15/05/12 21:29:07 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@ip-10-10-127-10.ec2.internal:53747]
15/05/12 21:29:07 INFO spark.SparkEnv: Registering MapOutputTracker
15/05/12 21:29:07 INFO spark.SparkEnv: Registering BlockManagerMaster
15/05/12 21:29:07 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-0877f4af-cae7-4d4c-b9f3-434712e8a654/blockmgr-f202f0b4-3842-40f5-934f-43fd764e641a
15/05/12 21:29:07 INFO storage.MemoryStore: MemoryStore started with capacity 267.3 MB
15/05/12 21:29:08 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-20236d03-2f2c-4619-bd6b-859fbd4b18b9/httpd-3ac7254f-3216-4181-8323-bb7493bfea2a
15/05/12 21:29:08 INFO spark.HttpServer: Starting HTTP Server
15/05/12 21:29:08 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/05/12 21:29:08 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:37326
15/05/12 21:29:08 INFO util.Utils: Successfully started service 'HTTP file server' on port 37326.
15/05/12 21:29:08 INFO spark.SparkEnv: Registering OutputCommitCoordinator
15/05/12 21:29:08 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/05/12 21:29:08 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
15/05/12 21:29:08 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
15/05/12 21:29:08 INFO ui.SparkUI: Started SparkUI at http://ip-10-10-127-10.ec2.internal:4040
15/05/12 21:29:09 INFO spark.SparkContext: Added JAR file:/home/testuser/spark/lib/spark-examples-1.3.1-hadoop2.6.0.jar at http://10.10.127.10:37326/jars/spark-examples-1.3.1-hadoop2.6.0.jar with timestamp 1431466149854
15/05/12 21:29:10 INFO client.RMProxy: Connecting to ResourceManager at ip-10-10-127-10.ec2.internal/10.10.127.10:8032
15/05/12 21:29:11 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
15/05/12 21:29:11 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
15/05/12 21:29:11 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
15/05/12 21:29:11 INFO yarn.Client: Setting up container launch context for our AM
15/05/12 21:29:11 INFO yarn.Client: Preparing resources for our AM container
15/05/12 21:29:13 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 16 for testuser on 10.10.10.10:8020
15/05/12 21:29:13 INFO yarn.Client: Uploading resource file:/home/testuser/spark/lib/spark-assembly-1.3.1-hadoop2.6.0.jar -> hdfs://10.10.10.10:8020/user/testuser/.sparkStaging/application_1431466075462_0001/spark-assembly-1.3.1-hadoop2.6.0.jar
15/05/12 21:29:18 INFO yarn.Client: Setting up the launch environment for our AM container
15/05/12 21:29:18 INFO spark.SecurityManager: Changing view acls to: testuser
15/05/12 21:29:18 INFO spark.SecurityManager: Changing modify acls to: testuser
15/05/12 21:29:18 INFO spark.SecurityManager: SecurityManager: authentication enabled; ui acls disabled; users with view permissions: Set(testuser); users with modify permissions: Set(testuser)
15/05/12 21:29:18 INFO yarn.Client: Submitting application 1 to ResourceManager
15/05/12 21:29:21 INFO impl.YarnClientImpl: Submitted application application_1431466075462_0001
15/05/12 21:29:22 INFO yarn.Client: Application report for application_1431466075462_0001 (state: ACCEPTED)
15/05/12 21:29:22 INFO yarn.Client:
         client token: Token { kind: YARN_CLIENT_TOKEN, service:  }
         diagnostics: N/A
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         queue: default
         start time: 1431466159129
         final status: UNDEFINED
         tracking URL: https://ip-10-10-127-10.ec2.internal:8090/proxy/application_1431466075462_0001/
         user: testuser
15/05/12 21:29:23 INFO yarn.Client: Application report for application_1431466075462_0001 (state: ACCEPTED)
15/05/12 21:29:24 INFO yarn.Client: Application report for application_1431466075462_0001 (state: ACCEPTED)
15/05/12 21:29:25 INFO yarn.Client: Application report for application_1431466075462_0001 (state: ACCEPTED)
...(TRUNCATED)...
15/05/12 21:41:13 INFO yarn.Client: Application report for application_1431466075462_0001 (state: ACCEPTED)
15/05/12 21:41:14 INFO yarn.Client: Application report for application_1431466075462_0001 (state: ACCEPTED)
15/05/12 21:41:15 INFO yarn.Client: Application report for application_1431466075462_0001 (state: ACCEPTED)
15/05/12 21:41:16 INFO yarn.Client: Application report for application_1431466075462_0001 (state: FAILED)
15/05/12 21:41:16 INFO yarn.Client:
         client token: N/A
         diagnostics: Application application_1431466075462_0001 failed 2 times due to AM Container for appattempt_1431466075462_0001_000002 exited with  exitCode: -1000
For more detailed output, check application tracking page:https://ip-10-10-127-10.ec2.internal:8090/proxy/application_1431466075462_0001/Then, click on links to logs of each attempt.
Diagnostics: Application application_1431466075462_0001 initialization failed (exitCode=255) with output: main : command provided 0
main : user is testuser
main : requested yarn user is testuser
Path /tmp/hadoop-yarn/nm-local-dir/usercache/testuser/appcache/application_1431466075462_0001 does not have desired permission.
Did not create any app directories

Failing this attempt. Failing the application.
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         queue: default
         start time: 1431466159129
         final status: FAILED
         tracking URL: https://ip-10-10-127-10.ec2.internal:8090/cluster/app/application_1431466075462_0001
         user: testuser
Exception in thread "main" org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
        at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:113)
        at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:59)
        at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:141)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:381)
        at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:28)
        at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
[testuser@ip-10-10-127-10 spark]$


####: NODEMANAGER :####
2015-05-12 21:29:22,237 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1431466075462_0001_000001 (auth:SIMPLE)
2015-05-12 21:29:22,408 WARN org.apache.hadoop.security.ShellBasedUnixGroupsMapping: got exception trying to get groups for user appattempt_1431466075462_0001_000001: id: appattempt_1431466075462_0001_000001: no such user

2015-05-12 21:29:22,409 WARN org.apache.hadoop.security.UserGroupInformation: No groups available for user appattempt_1431466075462_0001_000001
2015-05-12 21:29:22,409 INFO SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager: Authorization successful for appattempt_1431466075462_0001_000001 (auth:TOKEN) for protocol=interface org.apache.hadoop.yarn.api.ContainerManagementProtocolPB
2015-05-12 21:29:22,570 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Start request for container_1431466075462_0001_01_000001 by user testuser
2015-05-12 21:29:22,648 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Creating a new application reference for app application_1431466075462_0001
2015-05-12 21:29:22,663 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1431466075462_0001 transitioned from NEW to INITING
2015-05-12 21:29:22,679 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1431466075462_0001 transitioned from INITING to RUNNING
2015-05-12 21:29:22,680 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=testuser     IP=10.10.127.10 OPERATION=Start Container Request       TARGET=ContainerManageImpl RESULT=SUCCESS   APPID=application_1431466075462_0001    CONTAINERID=container_1431466075462_0001_01_000001
2015-05-12 21:29:22,689 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Adding container_1431466075462_0001_01_000001 to application application_1431466075462_0001
2015-05-12 21:29:22,707 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1431466075462_0001_01_000001 transitioned from NEW to LOCALIZING
2015-05-12 21:29:22,707 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_INIT for appId application_1431466075462_0001
2015-05-12 21:29:22,773 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://10.10.10.10:8020/user/testuser/.sparkStaging/application_1431466075462_0001/spark-assembly-1.3.1-hadoop2.6.0.jar transitioned from INIT to DOWNLOADING
2015-05-12 21:29:22,773 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Created localizer for container_1431466075462_0001_01_000001
2015-05-12 21:29:23,306 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Writing credentials to the nmPrivate file /tmp/hadoop-yarn/nm-local-dir/nmPrivate/container_1431466075462_0001_01_000001.tokens. Credentials list:
2015-05-12 21:41:15,528 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1431466075462_0001_000001 (auth:SIMPLE)
2015-05-12 21:41:15,535 WARN org.apache.hadoop.security.ShellBasedUnixGroupsMapping: got exception trying to get groups for user appattempt_1431466075462_0001_000001: id: appattempt_1431466075462_0001_000001: no such user

2015-05-12 21:41:15,536 WARN org.apache.hadoop.security.UserGroupInformation: No groups available for user appattempt_1431466075462_0001_000001
2015-05-12 21:41:15,536 INFO SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager: Authorization successful for appattempt_1431466075462_0001_000001 (auth:TOKEN) for protocol=interface org.apache.hadoop.yarn.api.ContainerManagementProtocolPB
2015-05-12 21:41:15,539 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Stopping container with container Id: container_1431466075462_0001_01_000001
2015-05-12 21:41:15,539 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=testuser     IP=10.10.127.10 OPERATION=Stop Container Request        TARGET=ContainerManageImpl RESULT=SUCCESS   APPID=application_1431466075462_0001    CONTAINERID=container_1431466075462_0001_01_000001
2015-05-12 21:41:15,566 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1431466075462_0001_01_000001 transitioned from LOCALIZING to KILLING
2015-05-12 21:41:15,568 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1431466075462_0001_01_000001 transitioned from KILLING to DONE
2015-05-12 21:41:15,568 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Removing container_1431466075462_0001_01_000001 from application application_1431466075462_0001
2015-05-12 21:41:15,569 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_STOP for appId application_1431466075462_0001
2015-05-12 21:41:15,594 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1431466075462_0001_000002 (auth:SIMPLE)
2015-05-12 21:41:15,602 WARN org.apache.hadoop.security.ShellBasedUnixGroupsMapping: got exception trying to get groups for user appattempt_1431466075462_0001_000002: id: appattempt_1431466075462_0001_000002: no such user

2015-05-12 21:41:15,602 WARN org.apache.hadoop.security.UserGroupInformation: No groups available for user appattempt_1431466075462_0001_000002
2015-05-12 21:41:15,603 INFO SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager: Authorization successful for appattempt_1431466075462_0001_000002 (auth:TOKEN) for protocol=interface org.apache.hadoop.yarn.api.ContainerManagementProtocolPB
2015-05-12 21:41:15,604 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Start request for container_1431466075462_0001_02_000001 by user testuser
2015-05-12 21:41:15,604 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Adding container_1431466075462_0001_02_000001 to application application_1431466075462_0001
2015-05-12 21:41:15,605 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1431466075462_0001_02_000001 transitioned from NEW to LOCALIZING
2015-05-12 21:41:15,605 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_INIT for appId application_1431466075462_0001
2015-05-12 21:41:15,605 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Created localizer for container_1431466075462_0001_02_000001
2015-05-12 21:41:15,604 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=testuser     IP=10.10.127.10 OPERATION=Start Container Request       TARGET=ContainerManageImpl RESULT=SUCCESS   APPID=application_1431466075462_0001    CONTAINERID=container_1431466075462_0001_02_000001
2015-05-12 21:41:15,620 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Writing credentials to the nmPrivate file /tmp/hadoop-yarn/nm-local-dir/nmPrivate/container_1431466075462_0001_02_000001.tokens. Credentials list:
2015-05-12 21:41:15,672 WARN org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Exit code from container container_1431466075462_0001_02_000001 startLocalizer is : 255
ExitCodeException exitCode=255:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
        at org.apache.hadoop.util.Shell.run(Shell.java:455)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
        at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:232)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1088)
2015-05-12 21:41:15,673 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main : command provided 0
2015-05-12 21:41:15,673 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main : user is testuser
2015-05-12 21:41:15,673 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main : requested yarn user is testuser
2015-05-12 21:41:15,673 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Path /tmp/hadoop-yarn/nm-local-dir/usercache/testuser/appcache/application_1431466075462_0001 does not have desired permission.
2015-05-12 21:41:15,673 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Did not create any app directories
2015-05-12 21:41:15,673 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Localizer failed
java.io.IOException: Application application_1431466075462_0001 initialization failed (exitCode=255) with output: main : command provided 0
main : user is testuser
main : requested yarn user is testuser
Path /tmp/hadoop-yarn/nm-local-dir/usercache/testuser/appcache/application_1431466075462_0001 does not have desired permission.
Did not create any app directories

        at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:241)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1088)
Caused by: ExitCodeException exitCode=255:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
        at org.apache.hadoop.util.Shell.run(Shell.java:455)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
        at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:232)
        ... 1 more
2015-05-12 21:41:15,674 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1431466075462_0001_02_000001 transitioned from LOCALIZING to LOCALIZATION_FAILED
2015-05-12 21:41:15,675 WARN org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=testuser     OPERATION=Container Finished - Failed   TARGET=ContainerImpl    RESULT=FAILURE  DESCRIPTION=Container failed with state: LOCALIZATION_FAILED        APPID=application_1431466075462_0001    CONTAINERID=container_1431466075462_0001_02_000001
2015-05-12 21:41:15,675 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1431466075462_0001_02_000001 transitioned from LOCALIZATION_FAILED to DONE
2015-05-12 21:41:15,675 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Removing container_1431466075462_0001_02_000001 from application application_1431466075462_0001
2015-05-12 21:41:15,675 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_STOP for appId application_1431466075462_0001
2015-05-12 21:41:17,593 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Removed completed containers from NM context: [container_1431466075462_0001_02_000001]
2015-05-12 21:41:17,594 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1431466075462_0001 transitioned from RUNNING to APPLICATION_RESOURCES_CLEANINGUP
2015-05-12 21:41:17,594 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event APPLICATION_STOP for appId application_1431466075462_0001
2015-05-12 21:41:17,595 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1431466075462_0001 transitioned from APPLICATION_RESOURCES_CLEANINGUP to FINISHED
2015-05-12 21:41:17,595 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler: Scheduling Log Deletion for application: application_1431466075462_0001, with delay of 10800 seconds
2015-05-12 21:41:17,654 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Stopping resource-monitoring for container_1431466075462_0001_01_000001
2015-05-12 21:41:17,654 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Stopping resource-monitoring for container_1431466075462_0001_02_000001
2015-05-12 21:44:48,514 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Localizer failed
java.io.IOException: java.lang.InterruptedException
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:541)
        at org.apache.hadoop.util.Shell.run(Shell.java:455)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
        at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:232)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1088)
2015-05-12 21:44:48,514 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Event EventType: RESOURCE_FAILED sent to absent container container_1431466075462_0001_01_000001
[yarn@ip-10-10-128-10 hadoop]$



####: RESOURCEMANAGER :####
2015-05-12 21:29:11,624 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for testuser@MALARD.LOCAL (auth:KERBEROS)
2015-05-12 21:29:11,702 INFO SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager: Authorization successful for testuser@MALARD.LOCAL (auth:KERBEROS) for protocol=interface org.apache.hadoop.yarn.api.ApplicationClientProtocolPB
2015-05-12 21:29:11,807 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new applicationId: 1
2015-05-12 21:29:19,129 WARN org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The specific max attempts: 0 for application: 1 is invalid, because it is out of the range [1, 2]. Use the global max attempts instead.
2015-05-12 21:29:19,144 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Application with id 1 submitted by user testuser
2015-05-12 21:29:19,145 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=testuser IP=10.10.127.10 OPERATION=Submit Application Request    TARGET=ClientRMService  RESULT=SUCCESS     APPID=application_1431466075462_0001
2015-05-12 21:29:19,535 INFO org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer: application_1431466075462_0001 found existing hdfs token Kind: HDFS_DELEGATION_TOKEN, Service: 10.10.10.10:8020, Ident: (HDFS_DELEGATION_TOKEN token 16 for testuser)
2015-05-12 21:29:21,558 INFO org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer: Renewed delegation-token= [Kind: HDFS_DELEGATION_TOKEN, Service: 10.10.10.10:8020, Ident: (HDFS_DELEGATION_TOKEN token 16 for testuser);exp=1431552561549], for application_1431466075462_0001
2015-05-12 21:29:21,559 INFO org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer: Renew Kind: HDFS_DELEGATION_TOKEN, Service: 10.10.10.10:8020, Ident: (HDFS_DELEGATION_TOKEN token 16 for testuser);exp=1431552561549 in 86399991 ms, appId = application_1431466075462_0001
2015-05-12 21:29:21,559 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Storing application with id application_1431466075462_0001
2015-05-12 21:29:21,561 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1431466075462_0001 State change from NEW to NEW_SAVING
2015-05-12 21:29:21,575 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Storing info for app: application_1431466075462_0001
2015-05-12 21:29:21,589 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1431466075462_0001 State change from NEW_SAVING to SUBMITTED
2015-05-12 21:29:21,591 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Application added - appId: application_1431466075462_0001 user: testuser leaf-queue of parent: root #applications: 1
2015-05-12 21:29:21,591 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Accepted application application_1431466075462_0001 from user: testuser, in queue: default
2015-05-12 21:29:21,593 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1431466075462_0001 State change from SUBMITTED to ACCEPTED
2015-05-12 21:29:21,647 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1431466075462_0001_000001
2015-05-12 21:29:21,648 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1431466075462_0001_000001 State change from NEW to SUBMITTED
2015-05-12 21:29:21,671 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application application_1431466075462_0001 from user: testuser activated in queue: default
2015-05-12 21:29:21,671 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application added - appId: application_1431466075462_0001 user: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@73e84841, leaf-queue: default #user-pending-applications: 0 #user-active-applications: 1 #queue-pending-applications: 0 #queue-active-applications: 1
2015-05-12 21:29:21,671 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Added Application Attempt appattempt_1431466075462_0001_000001 to scheduler from user testuser in queue default
2015-05-12 21:29:21,673 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1431466075462_0001_000001 State change from SUBMITTED to SCHEDULED
2015-05-12 21:29:21,869 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1431466075462_0001_01_000001 Container Transitioned from NEW to ALLOCATED
2015-05-12 21:29:21,869 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=testuser OPERATION=AM Allocated Container        TARGET=SchedulerApp     RESULT=SUCCESS  APPID=application_1431466075462_0001       CONTAINERID=container_1431466075462_0001_01_000001
2015-05-12 21:29:21,870 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_1431466075462_0001_01_000001 of capacity <memory:1024, vCores:1> on host ip-10-10-128-10.ec2.internal:9032, which has 1 containers, <memory:1024, vCores:1> used and <memory:7168, vCores:7> available after allocation
2015-05-12 21:29:21,870 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: assignedContainer application attempt=appattempt_1431466075462_0001_000001 container=Container: [ContainerId: container_1431466075462_0001_01_000001, NodeId: ip-10-10-128-10.ec2.internal:9032, NodeHttpAddress: ip-10-10-128-10.ec2.internal:8090, Resource: <memory:1024, vCores:1>, Priority: 0, Token: null, ] queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0 clusterResource=<memory:8192, vCores:8>
2015-05-12 21:29:21,870 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Re-sorting assigned queue: root.default stats: default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:1024, vCores:1>, usedCapacity=0.125, absoluteUsedCapacity=0.125, numApps=1, numContainers=1
2015-05-12 21:29:21,870 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: assignedContainer queue=root usedCapacity=0.125 absoluteUsedCapacity=0.125 used=<memory:1024, vCores:1> cluster=<memory:8192, vCores:8>
2015-05-12 21:29:21,891 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Sending NMToken for nodeId : ip-10-10-128-10.ec2.internal:9032 for container : container_1431466075462_0001_01_000001
2015-05-12 21:29:21,907 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1431466075462_0001_01_000001 Container Transitioned from ALLOCATED to ACQUIRED
2015-05-12 21:29:21,907 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Clear node set for appattempt_1431466075462_0001_000001
2015-05-12 21:29:21,910 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Storing attempt: AppId: application_1431466075462_0001 AttemptId: appattempt_1431466075462_0001_000001 MasterContainer: Container: [ContainerId: container_1431466075462_0001_01_000001, NodeId: ip-10-10-128-10.ec2.internal:9032, NodeHttpAddress: ip-10-10-128-10.ec2.internal:8090, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.10.128.10:9032 }, ]
2015-05-12 21:29:21,927 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1431466075462_0001_000001 State change from SCHEDULED to ALLOCATED_SAVING
2015-05-12 21:29:21,942 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1431466075462_0001_000001 State change from ALLOCATED_SAVING to ALLOCATED
2015-05-12 21:29:21,945 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Launching masterappattempt_1431466075462_0001_000001
2015-05-12 21:29:21,970 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Setting up container Container: [ContainerId: container_1431466075462_0001_01_000001, NodeId: ip-10-10-128-10.ec2.internal:9032, NodeHttpAddress: ip-10-10-128-10.ec2.internal:8090, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.10.128.10:9032 }, ] for AM appattempt_1431466075462_0001_000001
2015-05-12 21:29:21,970 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to launch container container_1431466075462_0001_01_000001 : {{JAVA_HOME}}/bin/java,-server,-Xmx512m,-Djava.io.tmpdir={{PWD}}/tmp,'-Dspark.driver.host=ip-10-10-127-10.ec2.internal','-Dspark.driver.port=53747','-Dspark.driver.appUIAddress=http://ip-10-10-127-10.ec2.internal:4040','-Dspark.master=yarn-client','-Dspark.fileserver.uri=http://10.10.127.10:37326','-Dspark.executor.extraJavaOptions=-XX:+PrintGCDetails -Dkey=value -Dnumbers=\"one two three\"','-Dspark.yarn.access.namenodes=hdfs://10.10.10.10:8020','-Dspark.logConf=true','-Dspark.serializer=org.apache.spark.serializer.KryoSerializer','-Dspark.executor.id=<driver>','-Dspark.jars=file:/home/testuser/spark/lib/spark-examples-1.3.1-hadoop2.6.0.jar','-Dspark.executor.instances=1','-Dspark.app.name=Spark Pi','-Dspark.eventlog.dir=hdfs://10.10.10.10:8020/user/testuser/spark/eventlog','-Dspark.tachyonStore.folderName=spark-9481ab9a-85db-4bfe-9d2f-ceb45f31d37c','-Dspark.executor.cores=1','-Dspark.eventlog.enabled=true','-Dspark.authenticate=true',-Dspark.yarn.app.container.log.dir=<LOG_DIR>,org.apache.spark.deploy.yarn.ExecutorLauncher,--arg,'ip-10-10-127-10.ec2.internal:53747',--executor-memory,1024m,--executor-cores,1,--num-executors ,1,1>,<LOG_DIR>/stdout,2>,<LOG_DIR>/stderr
2015-05-12 21:29:21,982 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Create AMRMToken for ApplicationAttempt: appattempt_1431466075462_0001_000001
2015-05-12 21:29:21,985 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Creating password for appattempt_1431466075462_0001_000001
2015-05-12 21:29:22,710 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Done launching container Container: [ContainerId: container_1431466075462_0001_01_000001, NodeId: ip-10-10-128-10.ec2.internal:9032, NodeHttpAddress: ip-10-10-128-10.ec2.internal:8090, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.10.128.10:9032 }, ] for AM appattempt_1431466075462_0001_000001
2015-05-12 21:29:22,710 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1431466075462_0001_000001 State change from ALLOCATED to LAUNCHED
2015-05-12 21:29:22,915 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1431466075462_0001_01_000001 Container Transitioned from ACQUIRED to RUNNING
2015-05-12 21:37:55,432 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler: Release request cache is cleaned up
2015-05-12 21:41:15,504 INFO org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: Expired:appattempt_1431466075462_0001_000001 Timed out after 600 secs
2015-05-12 21:41:15,505 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Updating application attempt appattempt_1431466075462_0001_000001 with final state: FAILED, and exit status: -1000
2015-05-12 21:41:15,506 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1431466075462_0001_000001 State change from LAUNCHED to FINAL_SAVING
2015-05-12 21:41:15,506 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Unregistering app attempt : appattempt_1431466075462_0001_000001
2015-05-12 21:41:15,507 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Application finished, removing password for appattempt_1431466075462_0001_000001
2015-05-12 21:41:15,507 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1431466075462_0001_000001 State change from FINAL_SAVING to FAILED
2015-05-12 21:41:15,508 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The number of failed attempts is 1. The max attempts is 2
2015-05-12 21:41:15,508 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1431466075462_0001_000002
2015-05-12 21:41:15,508 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1431466075462_0001_000002 State change from NEW to SUBMITTED
2015-05-12 21:41:15,508 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application Attempt appattempt_1431466075462_0001_000001 is done. finalState=FAILED
2015-05-12 21:41:15,510 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1431466075462_0001_01_000001 Container Transitioned from RUNNING to KILLED
2015-05-12 21:41:15,510 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: Completed container: container_1431466075462_0001_01_000001 in state: KILLED event:KILL
2015-05-12 21:41:15,510 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=testuser OPERATION=AM Released Container TARGET=SchedulerApp     RESULT=SUCCESS  APPID=application_1431466075462_0001       CONTAINERID=container_1431466075462_0001_01_000001
2015-05-12 21:41:15,511 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Released container container_1431466075462_0001_01_000001 of capacity <memory:1024, vCores:1> on host ip-10-10-128-10.ec2.internal:9032, which currently has 0 containers, <memory:0, vCores:0> used and <memory:8192, vCores:8> available, release resources=true
2015-05-12 21:41:15,511 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: default used=<memory:0, vCores:0> numContainers=0 user=testuser user-resources=<memory:0, vCores:0>
2015-05-12 21:41:15,511 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: completedContainer container=Container: [ContainerId: container_1431466075462_0001_01_000001, NodeId: ip-10-10-128-10.ec2.internal:9032, NodeHttpAddress: ip-10-10-128-10.ec2.internal:8090, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.10.128.10:9032 }, ] queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0 cluster=<memory:8192, vCores:8>
2015-05-12 21:41:15,511 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: completedContainer queue=root usedCapacity=0.0 absoluteUsedCapacity=0.0 used=<memory:0, vCores:0> cluster=<memory:8192, vCores:8>
2015-05-12 21:41:15,511 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Re-sorting completed queue: root.default stats: default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0
2015-05-12 21:41:15,511 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application attempt appattempt_1431466075462_0001_000001 released container container_1431466075462_0001_01_000001 on node: host: ip-10-10-128-10.ec2.internal:9032 #containers=0 available=8192 used=0 with event: KILL
2015-05-12 21:41:15,511 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1431466075462_0001 requests cleared
2015-05-12 21:41:15,511 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application removed - appId: application_1431466075462_0001 user: testuser queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0
2015-05-12 21:41:15,511 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application application_1431466075462_0001 from user: testuser activated in queue: default
2015-05-12 21:41:15,511 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application added - appId: application_1431466075462_0001 user: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@5fe8d552, leaf-queue: default #user-pending-applications: 0 #user-active-applications: 1 #queue-pending-applications: 0 #queue-active-applications: 1
2015-05-12 21:41:15,511 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Added Application Attempt appattempt_1431466075462_0001_000002 to scheduler from user testuser in queue default
2015-05-12 21:41:15,512 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1431466075462_0001_000002 State change from SUBMITTED to SCHEDULED
2015-05-12 21:41:15,512 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Cleaning master appattempt_1431466075462_0001_000001
2015-05-12 21:41:15,585 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed...
2015-05-12 21:41:15,585 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1431466075462_0001_02_000001 Container Transitioned from NEW to ALLOCATED
2015-05-12 21:41:15,585 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=testuser OPERATION=AM Allocated Container        TARGET=SchedulerApp     RESULT=SUCCESS  APPID=application_1431466075462_0001       CONTAINERID=container_1431466075462_0001_02_000001
2015-05-12 21:41:15,585 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_1431466075462_0001_02_000001 of capacity <memory:1024, vCores:1> on host ip-10-10-128-10.ec2.internal:9032, which has 1 containers, <memory:1024, vCores:1> used and <memory:7168, vCores:7> available after allocation
2015-05-12 21:41:15,586 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: assignedContainer application attempt=appattempt_1431466075462_0001_000002 container=Container: [ContainerId: container_1431466075462_0001_02_000001, NodeId: ip-10-10-128-10.ec2.internal:9032, NodeHttpAddress: ip-10-10-128-10.ec2.internal:8090, Resource: <memory:1024, vCores:1>, Priority: 0, Token: null, ] queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0 clusterResource=<memory:8192, vCores:8>
2015-05-12 21:41:15,586 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Re-sorting assigned queue: root.default stats: default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:1024, vCores:1>, usedCapacity=0.125, absoluteUsedCapacity=0.125, numApps=1, numContainers=1
2015-05-12 21:41:15,586 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: assignedContainer queue=root usedCapacity=0.125 absoluteUsedCapacity=0.125 used=<memory:1024, vCores:1> cluster=<memory:8192, vCores:8>
2015-05-12 21:41:15,587 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Sending NMToken for nodeId : ip-10-10-128-10.ec2.internal:9032 for container : container_1431466075462_0001_02_000001
2015-05-12 21:41:15,588 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1431466075462_0001_02_000001 Container Transitioned from ALLOCATED to ACQUIRED
2015-05-12 21:41:15,588 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Clear node set for appattempt_1431466075462_0001_000002
2015-05-12 21:41:15,588 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Storing attempt: AppId: application_1431466075462_0001 AttemptId: appattempt_1431466075462_0001_000002 MasterContainer: Container: [ContainerId: container_1431466075462_0001_02_000001, NodeId: ip-10-10-128-10.ec2.internal:9032, NodeHttpAddress: ip-10-10-128-10.ec2.internal:8090, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.10.128.10:9032 }, ]
2015-05-12 21:41:15,588 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1431466075462_0001_000002 State change from SCHEDULED to ALLOCATED_SAVING
2015-05-12 21:41:15,588 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1431466075462_0001_000002 State change from ALLOCATED_SAVING to ALLOCATED
2015-05-12 21:41:15,589 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Launching masterappattempt_1431466075462_0001_000002
2015-05-12 21:41:15,590 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Setting up container Container: [ContainerId: container_1431466075462_0001_02_000001, NodeId: ip-10-10-128-10.ec2.internal:9032, NodeHttpAddress: ip-10-10-128-10.ec2.internal:8090, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.10.128.10:9032 }, ] for AM appattempt_1431466075462_0001_000002
2015-05-12 21:41:15,590 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to launch container container_1431466075462_0001_02_000001 : {{JAVA_HOME}}/bin/java,-server,-Xmx512m,-Djava.io.tmpdir={{PWD}}/tmp,'-Dspark.driver.host=ip-10-10-127-10.ec2.internal','-Dspark.driver.port=53747','-Dspark.driver.appUIAddress=http://ip-10-10-127-10.ec2.internal:4040','-Dspark.master=yarn-client','-Dspark.fileserver.uri=http://10.10.127.10:37326','-Dspark.executor.extraJavaOptions=-XX:+PrintGCDetails -Dkey=value -Dnumbers=\"one two three\"','-Dspark.yarn.access.namenodes=hdfs://10.10.10.10:8020','-Dspark.logConf=true','-Dspark.serializer=org.apache.spark.serializer.KryoSerializer','-Dspark.executor.id=<driver>','-Dspark.jars=file:/home/testuser/spark/lib/spark-examples-1.3.1-hadoop2.6.0.jar','-Dspark.executor.instances=1','-Dspark.app.name=Spark Pi','-Dspark.eventlog.dir=hdfs://10.10.10.10:8020/user/testuser/spark/eventlog','-Dspark.tachyonStore.folderName=spark-9481ab9a-85db-4bfe-9d2f-ceb45f31d37c','-Dspark.executor.cores=1','-Dspark.eventlog.enabled=true','-Dspark.authenticate=true',-Dspark.yarn.app.container.log.dir=<LOG_DIR>,org.apache.spark.deploy.yarn.ExecutorLauncher,--arg,'ip-10-10-127-10.ec2.internal:53747',--executor-memory,1024m,--executor-cores,1,--num-executors ,1,1>,<LOG_DIR>/stdout,2>,<LOG_DIR>/stderr
2015-05-12 21:41:15,590 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Create AMRMToken for ApplicationAttempt: appattempt_1431466075462_0001_000002
2015-05-12 21:41:15,590 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Creating password for appattempt_1431466075462_0001_000002
2015-05-12 21:41:15,607 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Done launching container Container: [ContainerId: container_1431466075462_0001_02_000001, NodeId: ip-10-10-128-10.ec2.internal:9032, NodeHttpAddress: ip-10-10-128-10.ec2.internal:8090, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.10.128.10:9032 }, ] for AM appattempt_1431466075462_0001_000002
2015-05-12 21:41:15,607 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1431466075462_0001_000002 State change from ALLOCATED to LAUNCHED
2015-05-12 21:41:16,590 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed...
2015-05-12 21:41:16,590 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1431466075462_0001_02_000001 Container Transitioned from ACQUIRED to COMPLETED
2015-05-12 21:41:16,591 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: Completed container: container_1431466075462_0001_02_000001 in state: COMPLETED event:FINISHED
2015-05-12 21:41:16,591 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=testuser OPERATION=AM Released Container TARGET=SchedulerApp     RESULT=SUCCESS  APPID=application_1431466075462_0001       CONTAINERID=container_1431466075462_0001_02_000001
2015-05-12 21:41:16,591 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Released container container_1431466075462_0001_02_000001 of capacity <memory:1024, vCores:1> on host ip-10-10-128-10.ec2.internal:9032, which currently has 0 containers, <memory:0, vCores:0> used and <memory:8192, vCores:8> available, release resources=true
2015-05-12 21:41:16,591 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: default used=<memory:0, vCores:0> numContainers=0 user=testuser user-resources=<memory:0, vCores:0>
2015-05-12 21:41:16,591 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: completedContainer container=Container: [ContainerId: container_1431466075462_0001_02_000001, NodeId: ip-10-10-128-10.ec2.internal:9032, NodeHttpAddress: ip-10-10-128-10.ec2.internal:8090, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.10.128.10:9032 }, ] queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0 cluster=<memory:8192, vCores:8>
2015-05-12 21:41:16,591 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: completedContainer queue=root usedCapacity=0.0 absoluteUsedCapacity=0.0 used=<memory:0, vCores:0> cluster=<memory:8192, vCores:8>
2015-05-12 21:41:16,591 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Re-sorting completed queue: root.default stats: default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0
2015-05-12 21:41:16,591 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application attempt appattempt_1431466075462_0001_000002 released container container_1431466075462_0001_02_000001 on node: host: ip-10-10-128-10.ec2.internal:9032 #containers=0 available=8192 used=0 with event: FINISHED
2015-05-12 21:41:16,600 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Updating application attempt appattempt_1431466075462_0001_000002 with final state: FAILED, and exit status: -1000
2015-05-12 21:41:16,601 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1431466075462_0001_000002 State change from LAUNCHED to FINAL_SAVING
2015-05-12 21:41:16,601 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Unregistering app attempt : appattempt_1431466075462_0001_000002
2015-05-12 21:41:16,601 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Application finished, removing password for appattempt_1431466075462_0001_000002
2015-05-12 21:41:16,601 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1431466075462_0001_000002 State change from FINAL_SAVING to FAILED
2015-05-12 21:41:16,601 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The number of failed attempts is 2. The max attempts is 2
2015-05-12 21:41:16,602 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Updating application application_1431466075462_0001 with final state: FAILED
2015-05-12 21:41:16,602 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1431466075462_0001 State change from ACCEPTED to FINAL_SAVING
2015-05-12 21:41:16,602 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating info for app: application_1431466075462_0001
2015-05-12 21:41:16,603 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application Attempt appattempt_1431466075462_0001_000002 is done. finalState=FAILED
2015-05-12 21:41:16,603 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1431466075462_0001 requests cleared
2015-05-12 21:41:16,603 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application removed - appId: application_1431466075462_0001 user: testuser queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0
2015-05-12 21:41:16,603 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1431466075462_0001 failed 2 times due to AM Container for appattempt_1431466075462_0001_000002 exited with  exitCode: -1000
For more detailed output, check application tracking page:https://ip-10-10-127-10.ec2.internal:8090/proxy/application_1431466075462_0001/Then, click on links to logs of each attempt.
Diagnostics: Application application_1431466075462_0001 initialization failed (exitCode=255) with output: main : command provided 0
main : user is testuser
main : requested yarn user is testuser
Path /tmp/hadoop-yarn/nm-local-dir/usercache/testuser/appcache/application_1431466075462_0001 does not have desired permission.
Did not create any app directories

Failing this attempt. Failing the application.
2015-05-12 21:41:16,604 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1431466075462_0001 State change from FINAL_SAVING to FAILED
2015-05-12 21:41:16,605 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Application removed - appId: application_1431466075462_0001 user: testuser leaf-queue of parent: root #applications: 0
2015-05-12 21:41:16,605 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=testuser OPERATION=Application Finished - Failed TARGET=RMAppManager     RESULT=FAILURE  DESCRIPTION=App failed with state: FAILED  PERMISSIONS=Application application_1431466075462_0001 failed 2 times due to AM Container for appattempt_1431466075462_0001_000002 exited with  exitCode: -1000
For more detailed output, check application tracking page:https://ip-10-10-127-10.ec2.internal:8090/proxy/application_1431466075462_0001/Then, click on links to logs of each attempt.
Diagnostics: Application application_1431466075462_0001 initialization failed (exitCode=255) with output: main : command provided 0
main : user is testuser
main : requested yarn user is testuser
Path /tmp/hadoop-yarn/nm-local-dir/usercache/testuser/appcache/application_1431466075462_0001 does not have desired permission.
Did not create any app directories

Failing this attempt. Failing the application.  APPID=application_1431466075462_0001
2015-05-12 21:41:16,607 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary: appId=application_1431466075462_0001,name=Spark Pi,user=testuser,queue=default,state=FAILED,trackingUrl=https://ip-10-10-127-10.ec2.internal:8090/cluster/app/application_1431466075462_0001,appMasterHost=N/A,startTime=1431466159129,finishTime=1431466876602,finalStatus=FAILED
2015-05-12 21:41:16,629 INFO org.apache.hadoop.hdfs.DFSClient: Cancelling HDFS_DELEGATION_TOKEN token 16 for testuser on 10.10.10.10:8020
2015-05-12 21:41:17,593 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed...
2015-05-12 21:41:17,594 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed...
2015-05-12 21:41:18,597 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed...
[yarn@ip-10-10-127-10 hadoop]$
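
For reference, the decisive diagnostics above ("Path /tmp/hadoop-yarn/nm-local-dir/usercache/testuser/appcache/... does not have desired permission" / "Did not create any app directories") are emitted when the secure-mode container-executor refuses to reuse a local cache directory whose permissions do not match what it expects. A minimal sketch of the usual checks, run here against a scratch replica of the layout so it is safe to execute anywhere; the real path, and the assumption that the NodeManager runs as user yarn, should be verified against the SecureMode documentation linked in the reply at the top of this thread:

```shell
# Illustration on a scratch copy of the nm-local-dir layout from the log; on a
# real node you would run the stat/rm against /tmp/hadoop-yarn/nm-local-dir
# itself, with the NodeManager stopped.
ROOT=$(mktemp -d)                                  # stand-in for /tmp/hadoop-yarn
NM_LOCAL_DIR="$ROOT/nm-local-dir"                  # value of yarn.nodemanager.local-dirs
mkdir -p "$NM_LOCAL_DIR/usercache/testuser/appcache"

# In secure mode the local dir itself should be owned by the NM user (e.g. yarn)
# and must not be group/world-writable:
chmod 755 "$NM_LOCAL_DIR"
stat -c '%a %n' "$NM_LOCAL_DIR"

# A usercache left behind with the wrong permissions blocks localization;
# removing it lets the container-executor recreate it correctly on the next
# application attempt:
rm -rf "$NM_LOCAL_DIR/usercache/testuser"
```

If the permissions check out but localization still fails, the container-executor binary itself (setuid, root:hadoop) and container-executor.cfg are the next things to compare against the SecureMode page.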

Keith Nance
Sr. Software Engineer
Email: knance@smartronix.com
Cell: 808-343-0071
www.smartronix.com

