hadoop-common-user mailing list archives

From Ulul <had...@ulul.org>
Subject Re: Yarn AM is abending job when submitting a remote job to cluster
Date Sun, 22 Feb 2015 02:49:56 GMT
Hi Roland

I tried to reproduce your problem with a single-node setup submitting a 
job to a remote cluster (please note I'm an HDP user: a sandbox 
submitting to a three-VM cluster), and it worked like a charm.
I only ran into problems when submitting the job as another user, but 
that was a permission issue; it does not look like your AMRMToken problem.

We are probably submitting our jobs differently, though. I use hadoop jar 
with --config <conf dir>, while you seem to be using something different 
since you pass the -conf generic option.
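
For example (a rough sketch; jar names, class names and paths are just 
placeholders, not taken from your setup):

    # point the whole client at an alternate configuration directory
    hadoop --config /path/to/remote-cluster-conf jar myjob.jar my.MainClass in out

    # or pass a single resource file through the generic -conf option
    # (the driver must go through ToolRunner/GenericOptionsParser for this)
    hadoop jar myjob.jar my.MainClass -conf remote-cluster.xml in out

Both end up adding resources to the job Configuration, but --config 
replaces the client's whole conf directory while -conf only layers one 
extra file on top of whatever is already on the classpath.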

Would you please share your job command?

Ulul
On 20/02/2015 03:09, Roland DePratti wrote:
>
> Xuan,
>
> Thanks for asking. Here is the RM log. It almost looks like the log 
> completes successfully (see red highlighting).
>
> 2015-02-19 19:55:43,315 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: 
> Allocated new applicationId: 12
> 2015-02-19 19:55:44,659 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: 
> Application with id 12 submitted by user cloudera
> 2015-02-19 19:55:44,659 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Storing 
> application with id application_1424003606313_0012
> 2015-02-19 19:55:44,659 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: 
> USER=cloudera    IP=192.168.2.185    OPERATION=Submit Application 
> Request    TARGET=ClientRMService RESULT=SUCCESS    
> APPID=application_1424003606313_0012
> 2015-02-19 19:55:44,659 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: 
> application_1424003606313_0012 State change from NEW to NEW_SAVING
> 2015-02-19 19:55:44,659 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: 
> Storing info for app: application_1424003606313_0012
> 2015-02-19 19:55:44,660 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: 
> application_1424003606313_0012 State change from NEW_SAVING to SUBMITTED
> 2015-02-19 19:55:44,666 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
> Accepted application application_1424003606313_0012 from user: 
> cloudera, in queue: default, currently num of applications: 1
> 2015-02-19 19:55:44,667 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: 
> application_1424003606313_0012 State change from SUBMITTED to ACCEPTED
> 2015-02-19 19:55:44,667 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering 
> app attempt : appattempt_1424003606313_0012_000001
> 2015-02-19 19:55:44,667 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1424003606313_0012_000001 State change from NEW to SUBMITTED
> 2015-02-19 19:55:44,667 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
> Added Application Attempt appattempt_1424003606313_0012_000001 to 
> scheduler from user: cloudera
> 2015-02-19 19:55:44,669 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1424003606313_0012_000001 State change from SUBMITTED to 
> SCHEDULED
> 2015-02-19 19:55:50,671 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_1424003606313_0012_01_000001 Container Transitioned from NEW 
> to ALLOCATED
> 2015-02-19 19:55:50,671 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: 
> USER=cloudera    OPERATION=AM Allocated Container 
> TARGET=SchedulerApp    RESULT=SUCCESS 
> APPID=application_1424003606313_0012 
> CONTAINERID=container_1424003606313_0012_01_000001
> 2015-02-19 19:55:50,671 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: 
> Assigned container container_1424003606313_0012_01_000001 of capacity 
> <memory:1024, vCores:1> on host hadoop0.rdpratti.com:8041, which has 1 
> containers, <memory:1024, vCores:1> used and <memory:433, vCores:1> 
> available after allocation
> 2015-02-19 19:55:50,672 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: 
> Sending NMToken for nodeId : hadoop0.rdpratti.com:8041 for container : 
> container_1424003606313_0012_01_000001
> 2015-02-19 19:55:50,672 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_1424003606313_0012_01_000001 Container Transitioned from 
> ALLOCATED to ACQUIRED
> 2015-02-19 19:55:50,673 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: 
> Clear node set for appattempt_1424003606313_0012_000001
> 2015-02-19 19:55:50,673 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Storing attempt: AppId: application_1424003606313_0012 AttemptId: 
> appattempt_1424003606313_0012_000001 MasterContainer: Container: 
> [ContainerId: container_1424003606313_0012_01_000001, NodeId: 
> hadoop0.rdpratti.com:8041, NodeHttpAddress: hadoop0.rdpratti.com:8042, 
> Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: 
> ContainerToken, service: 192.168.2.253:8041 }, ]
> 2015-02-19 19:55:50,673 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1424003606313_0012_000001 State change from SCHEDULED to 
> ALLOCATED_SAVING
> 2015-02-19 19:55:50,673 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1424003606313_0012_000001 State change from 
> ALLOCATED_SAVING to ALLOCATED
> 2015-02-19 19:55:50,673 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: 
> Launching masterappattempt_1424003606313_0012_000001
> 2015-02-19 19:55:50,674 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: 
> Setting up container Container: [ContainerId: 
> container_1424003606313_0012_01_000001, NodeId: 
> hadoop0.rdpratti.com:8041, NodeHttpAddress: hadoop0.rdpratti.com:8042, 
> Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: 
> ContainerToken, service: 192.168.2.253:8041 }, ] for AM 
> appattempt_1424003606313_0012_000001
> 2015-02-19 19:55:50,675 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: 
> Command to launch container container_1424003606313_0012_01_000001 : 
> $JAVA_HOME/bin/java -Dlog4j.configuration=container-log4j.properties 
> -Dyarn.app.container.log.dir=<LOG_DIR> 
> -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
> -Djava.net.preferIPv4Stack=true -Xmx209715200 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 
> 2><LOG_DIR>/stderr
> 2015-02-19 19:55:50,675 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: 
> Create AMRMToken for ApplicationAttempt: 
> appattempt_1424003606313_0012_000001
> 2015-02-19 19:55:50,675 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: 
> Creating password for appattempt_1424003606313_0012_000001
> 2015-02-19 19:55:50,688 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: 
> Done launching container Container: [ContainerId: 
> container_1424003606313_0012_01_000001, NodeId: 
> hadoop0.rdpratti.com:8041, NodeHttpAddress: hadoop0.rdpratti.com:8042, 
> Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: 
> ContainerToken, service: 192.168.2.253:8041 }, ] for AM 
> appattempt_1424003606313_0012_000001
> 2015-02-19 19:55:50,688 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1424003606313_0012_000001 State change from ALLOCATED to 
> LAUNCHED
> 2015-02-19 19:55:50,928 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_1424003606313_0012_01_000001 Container Transitioned from 
> ACQUIRED to RUNNING
> 2015-02-19 19:55:57,941 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_1424003606313_0012_01_000001 Container Transitioned from 
> RUNNING to COMPLETED
> 2015-02-19 19:55:57,941 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
> Completed container: container_1424003606313_0012_01_000001 in state: 
> COMPLETED event:FINISHED
> 2015-02-19 19:55:57,942 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: 
> USER=cloudera    OPERATION=AM Released Container 
> TARGET=SchedulerApp    RESULT=SUCCESS 
> APPID=application_1424003606313_0012 
> CONTAINERID=container_1424003606313_0012_01_000001
> 2015-02-19 19:55:57,942 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: 
> Released container container_1424003606313_0012_01_000001 of capacity 
> <memory:1024, vCores:1> on host hadoop0.rdpratti.com:8041, which 
> currently has 0 containers, <memory:0, vCores:0> used and 
> <memory:1457, vCores:2> available, release resources=true
> 2015-02-19 19:55:57,942 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
> Application attempt appattempt_1424003606313_0012_000001 released 
> container container_1424003606313_0012_01_000001 on node: host: 
> hadoop0.rdpratti.com:8041 #containers=0 available=1457 used=0 with 
> event: FINISHED
> 2015-02-19 19:55:57,942 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Updating application attempt appattempt_1424003606313_0012_000001 with 
> final state: FAILED, and exit status: 1
> 2015-02-19 19:55:57,942 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1424003606313_0012_000001 State change from LAUNCHED to 
> FINAL_SAVING
> 2015-02-19 19:55:57,942 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Unregistering 
> app attempt : appattempt_1424003606313_0012_000001
> 2015-02-19 19:55:57,943 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: 
> Application finished, removing password for 
> appattempt_1424003606313_0012_000001
> 2015-02-19 19:55:57,943 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1424003606313_0012_000001 State change from FINAL_SAVING to 
> FAILED
> 2015-02-19 19:55:57,943 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
> Application appattempt_1424003606313_0012_000001 is done. 
> finalState=FAILED
> 2015-02-19 19:55:57,943 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering 
> app attempt : appattempt_1424003606313_0012_000002
> 2015-02-19 19:55:57,943 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: 
> Application application_1424003606313_0012 requests cleared
> 2015-02-19 19:55:57,943 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1424003606313_0012_000002 State change from NEW to SUBMITTED
> 2015-02-19 19:55:57,943 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
> Added Application Attempt appattempt_1424003606313_0012_000002 to 
> scheduler from user: cloudera
> 2015-02-19 19:55:57,943 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1424003606313_0012_000002 State change from SUBMITTED to 
> SCHEDULED
> 2015-02-19 19:55:58,941 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
> Null container completed...
> 2015-02-19 19:56:03,950 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_1424003606313_0012_02_000001 Container Transitioned from NEW 
> to ALLOCATED
> 2015-02-19 19:56:03,950 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: 
> USER=cloudera    OPERATION=AM Allocated Container 
> TARGET=SchedulerApp    RESULT=SUCCESS 
> APPID=application_1424003606313_0012 
> CONTAINERID=container_1424003606313_0012_02_000001
> 2015-02-19 19:56:03,950 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: 
> Assigned container container_1424003606313_0012_02_000001 of capacity 
> <memory:1024, vCores:1> on host hadoop0.rdpratti.com:8041, which has 1 
> containers, <memory:1024, vCores:1> used and <memory:433, vCores:1> 
> available after allocation
> 2015-02-19 19:56:03,950 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: 
> Sending NMToken for nodeId : hadoop0.rdpratti.com:8041 for container : 
> container_1424003606313_0012_02_000001
> 2015-02-19 19:56:03,951 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_1424003606313_0012_02_000001 Container Transitioned from 
> ALLOCATED to ACQUIRED
> 2015-02-19 19:56:03,951 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: 
> Clear node set for appattempt_1424003606313_0012_000002
> 2015-02-19 19:56:03,951 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Storing attempt: AppId: application_1424003606313_0012 AttemptId: 
> appattempt_1424003606313_0012_000002 MasterContainer: Container: 
> [ContainerId: container_1424003606313_0012_02_000001, NodeId: 
> hadoop0.rdpratti.com:8041, NodeHttpAddress: hadoop0.rdpratti.com:8042, 
> Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: 
> ContainerToken, service: 192.168.2.253:8041 }, ]
> 2015-02-19 19:56:03,952 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1424003606313_0012_000002 State change from SCHEDULED to 
> ALLOCATED_SAVING
> 2015-02-19 19:56:03,952 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1424003606313_0012_000002 State change from 
> ALLOCATED_SAVING to ALLOCATED
> 2015-02-19 19:56:03,952 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: 
> Launching masterappattempt_1424003606313_0012_000002
> 2015-02-19 19:56:03,953 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: 
> Setting up container Container: [ContainerId: 
> container_1424003606313_0012_02_000001, NodeId: 
> hadoop0.rdpratti.com:8041, NodeHttpAddress: hadoop0.rdpratti.com:8042, 
> Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: 
> ContainerToken, service: 192.168.2.253:8041 }, ] for AM 
> appattempt_1424003606313_0012_000002
> 2015-02-19 19:56:03,953 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: 
> Command to launch container container_1424003606313_0012_02_000001 : 
> $JAVA_HOME/bin/java -Dlog4j.configuration=container-log4j.properties 
> -Dyarn.app.container.log.dir=<LOG_DIR> 
> -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
> -Djava.net.preferIPv4Stack=true -Xmx209715200 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 
> 2><LOG_DIR>/stderr
> 2015-02-19 19:56:03,953 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: 
> Create AMRMToken for ApplicationAttempt: 
> appattempt_1424003606313_0012_000002
> 2015-02-19 19:56:03,953 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: 
> Creating password for appattempt_1424003606313_0012_000002
> 2015-02-19 19:56:03,974 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: 
> Done launching container Container: [ContainerId: 
> container_1424003606313_0012_02_000001, NodeId: 
> hadoop0.rdpratti.com:8041, NodeHttpAddress: hadoop0.rdpratti.com:8042, 
> Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: 
> ContainerToken, service: 192.168.2.253:8041 }, ] for AM 
> appattempt_1424003606313_0012_000002
> 2015-02-19 19:56:03,974 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1424003606313_0012_000002 State change from ALLOCATED to 
> LAUNCHED
> 2015-02-19 19:56:04,947 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_1424003606313_0012_02_000001 Container Transitioned from 
> ACQUIRED to RUNNING
> 2015-02-19 19:56:10,956 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_1424003606313_0012_02_000001 Container Transitioned from 
> RUNNING to COMPLETED
> 2015-02-19 19:56:10,956 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
> Completed container: container_1424003606313_0012_02_000001 in state: 
> COMPLETED event:FINISHED
> 2015-02-19 19:56:10,956 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: 
> USER=cloudera    OPERATION=AM Released Container 
> TARGET=SchedulerApp    RESULT=SUCCESS 
> APPID=application_1424003606313_0012 
> CONTAINERID=container_1424003606313_0012_02_000001
> 2015-02-19 19:56:10,956 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: 
> Released container container_1424003606313_0012_02_000001 of capacity 
> <memory:1024, vCores:1> on host hadoop0.rdpratti.com:8041, which 
> currently has 0 containers, <memory:0, vCores:0> used and 
> <memory:1457, vCores:2> available, release resources=true
> 2015-02-19 19:56:10,956 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Updating application attempt appattempt_1424003606313_0012_000002 with 
> final state: FAILED, and exit status: 1
> 2015-02-19 19:56:10,956 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
> Application attempt appattempt_1424003606313_0012_000002 released 
> container container_1424003606313_0012_02_000001 on node: host: 
> hadoop0.rdpratti.com:8041 #containers=0 available=1457 used=0 with 
> event: FINISHED
> 2015-02-19 19:56:10,956 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1424003606313_0012_000002 State change from LAUNCHED to 
> FINAL_SAVING
> 2015-02-19 19:56:10,956 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Unregistering 
> app attempt : appattempt_1424003606313_0012_000002
> 2015-02-19 19:56:10,957 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: 
> Application finished, removing password for 
> appattempt_1424003606313_0012_000002
> 2015-02-19 19:56:10,957 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1424003606313_0012_000002 State change from FINAL_SAVING to 
> FAILED
> 2015-02-19 19:56:10,957 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: 
> Updating application application_1424003606313_0012 with final state: 
> FAILED
> 2015-02-19 19:56:10,957 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: 
> application_1424003606313_0012 State change from ACCEPTED to FINAL_SAVING
> 2015-02-19 19:56:10,957 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: 
> Updating info for app: application_1424003606313_0012
> 2015-02-19 19:56:10,957 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
> Application appattempt_1424003606313_0012_000002 is done. 
> finalState=FAILED
> 2015-02-19 19:56:10,957 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: 
> Application application_1424003606313_0012 requests cleared
> 2015-02-19 19:56:10,990 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: 
> Application application_1424003606313_0012 failed 2 times due to AM 
> Container for appattempt_1424003606313_0012_000002 exited with  
> exitCode: 1 due to: Exception from container-launch.
> Container id: container_1424003606313_0012_02_000001
> Exit code: 1
> Stack trace: ExitCodeException exitCode=1:
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
>     at org.apache.hadoop.util.Shell.run(Shell.java:455)
>     at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
>     at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:197)
>     at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
>     at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
>
> Container exited with a non-zero exit code 1
> .Failing this attempt.. Failing the application.
> 2015-02-19 19:56:10,990 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: 
> application_1424003606313_0012 State change from FINAL_SAVING to FAILED
> 2015-02-19 19:56:10,991 WARN 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: 
> USER=cloudera    OPERATION=Application Finished - Failed 
> TARGET=RMAppManager    RESULT=FAILURE    DESCRIPTION=App failed with 
> state: FAILED    PERMISSIONS=Application 
> application_1424003606313_0012 failed 2 times due to AM Container for 
> appattempt_1424003606313_0012_000002 exited with  exitCode: 1 due to: 
> Exception from container-launch.
> Container id: container_1424003606313_0012_02_000001
> Exit code: 1
> Stack trace: ExitCodeException exitCode=1:
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
>     at org.apache.hadoop.util.Shell.run(Shell.java:455)
>     at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
>     at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:197)
>     at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
>     at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
> From: Xuan Gong [mailto:xgong@hortonworks.com]
> Sent: Thursday, February 19, 2015 8:23 PM
> To: user@hadoop.apache.org
> Subject: Re: Yarn AM is abending job when submitting a remote job to cluster
>
> Hey, Roland:
>
>   Could you also check the RM logs for this application, please? 
> Maybe we could find something there.
>
> Thanks
>
> Xuan Gong
>
> From: Roland DePratti <roland.depratti@cox.net>
> Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>
> Date: Thursday, February 19, 2015 at 5:11 PM
> To: "user@hadoop.apache.org" <user@hadoop.apache.org>
> Subject: RE: Yarn AM is abending job when submitting a remote job to cluster
>
> No, I hear you.
>
> I was just saying that, since hdfs works, something is right about the 
> connectivity: the server is reachable and hadoop was able to process 
> the request. But, as you said, that doesn't mean yarn works.
>
> I tried both your solution and Alex's solution, unfortunately without 
> any improvement.
>
> Here is the command I am executing:
>
> hadoop jar avgWordlength.jar  solution.AvgWordLength -conf 
> ~/conf/hadoop-cluster.xml /user/cloudera/shakespeare wordlength4
>
> Here is the new hadoop-cluster.xml:
>
> <?xml version="1.0" encoding="UTF-8"?>
>
> <!--generated by Roland-->
> <configuration>
>   <property>
>     <name>fs.defaultFS</name>
>     <value>hdfs://hadoop0.rdpratti.com:8020</value>
>   </property>
>   <property>
>     <name>mapreduce.jobtracker.address</name>
>     <value>hadoop0.rdpratti.com:8032</value>
>   </property>
>   <property>
>     <name>yarn.resourcemanager.address</name>
>     <value>hadoop0.rdpratti.com:8032</value>
>   </property>
> </configuration>
>
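> For reference, a fuller client-side file for remote YARN submission 
> might look like the sketch below. The two extra properties are typical 
> for this kind of setup and are only assumptions here, not values 
> verified against the cluster:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <configuration>
>   <property>
>     <name>fs.defaultFS</name>
>     <value>hdfs://hadoop0.rdpratti.com:8020</value>
>   </property>
>   <property>
>     <!-- assumed: run jobs on YARN rather than the local runner -->
>     <name>mapreduce.framework.name</name>
>     <value>yarn</value>
>   </property>
>   <property>
>     <name>yarn.resourcemanager.address</name>
>     <value>hadoop0.rdpratti.com:8032</value>
>   </property>
>   <property>
>     <!-- assumed: RM scheduler address, using the default port 8030 -->
>     <name>yarn.resourcemanager.scheduler.address</name>
>     <value>hadoop0.rdpratti.com:8030</value>
>   </property>
> </configuration>
>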
> I also deleted the .staging directory under the submitting user and 
> restarted the Job History Server.
>
> Resubmitted the job with the same result. Here is the log:
>
> 2015-02-19 19:56:05,061 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1424003606313_0012_000002
> 2015-02-19 19:56:05,468 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.require.client.cert;  Ignoring.
> 2015-02-19 19:56:05,471 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
> 2015-02-19 19:56:05,471 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.client.conf;  Ignoring.
> 2015-02-19 19:56:05,473 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.keystores.factory.class;  Ignoring.
> 2015-02-19 19:56:05,476 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.server.conf;  Ignoring.
> 2015-02-19 19:56:05,490 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
> 2015-02-19 19:56:05,621 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
> 2015-02-19 19:56:05,621 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (org.apache.hadoop.yarn.security.AMRMTokenIdentifier@3909f88f)
> 2015-02-19 19:56:05,684 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Using mapred newApiCommitter.
> 2015-02-19 19:56:05,923 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.require.client.cert;  Ignoring.
> 2015-02-19 19:56:05,925 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
> 2015-02-19 19:56:05,929 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.client.conf;  Ignoring.
> 2015-02-19 19:56:05,930 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.keystores.factory.class;  Ignoring.
> 2015-02-19 19:56:05,934 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.server.conf;  Ignoring.
> 2015-02-19 19:56:05,958 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
> 2015-02-19 19:56:06,529 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 2015-02-19 19:56:06,719 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config null
> 2015-02-19 19:56:06,837 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
> 2015-02-19 19:56:06,881 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.jobhistory.EventType for class org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler
> 2015-02-19 19:56:06,882 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher
> 2015-02-19 19:56:06,882 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher
> 2015-02-19 19:56:06,883 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher
> 2015-02-19 19:56:06,884 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
> 2015-02-19 19:56:06,885 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher
> 2015-02-19 19:56:06,885 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
> 2015-02-19 19:56:06,886 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter
> 2015-02-19 19:56:06,899 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Recovery is enabled. Will try to recover from previous life on best effort basis.
> 2015-02-19 19:56:06,918 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Previous history file is at hdfs://hadoop0.rdpratti.com:8020/user/cloudera/.staging/job_1424003606313_0012/job_1424003606313_0012_1.jhist
> 2015-02-19 19:56:07,377 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Read completed tasks from history 0
> 2015-02-19 19:56:07,423 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler
> 2015-02-19 19:56:07,453 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
> 2015-02-19 19:56:07,507 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
> 2015-02-19 19:56:07,507 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MRAppMaster metrics system started
> 2015-02-19 19:56:07,515 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Adding job token for job_1424003606313_0012 to jobTokenSecretManager
> 2015-02-19 19:56:07,536 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Not uberizing job_1424003606313_0012 because: not enabled; too much RAM;
> 2015-02-19 19:56:07,555 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job job_1424003606313_0012 = 5343207. Number of splits = 5
> 2015-02-19 19:56:07,557 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Number of reduces for job job_1424003606313_0012 = 1
> 2015-02-19 19:56:07,557 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1424003606313_0012Job Transitioned from NEW to INITED
> 2015-02-19 19:56:07,558 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster launching normal, non-uberized, multi-container job job_1424003606313_0012.
> 2015-02-19 19:56:07,618 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
> 2015-02-19 19:56:07,630 INFO [Socket Reader #1 for port 46841] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 46841
> 2015-02-19 19:56:07,648 INFO [main] org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.mapreduce.v2.api.MRClientProtocolPB to the server
> 2015-02-19 19:56:07,648 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
> 2015-02-19 19:56:07,649 INFO [main] org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Instantiated MRClientService at hadoop0.rdpratti.com/192.168.2.253:46841
> 2015-02-19 19:56:07,650 INFO [IPC Server listener on 46841] org.apache.hadoop.ipc.Server: IPC Server listener on 46841: starting
> 2015-02-19 19:56:07,721 INFO [main] org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
> 2015-02-19 19:56:07,727 INFO [main] org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.mapreduce is not defined
> 2015-02-19 19:56:07,739 INFO [main] org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
> 2015-02-19 19:56:07,745 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context mapreduce
> 2015-02-19 19:56:07,745 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context static
> 2015-02-19 19:56:07,749 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /mapreduce/*
> 2015-02-19 19:56:07,749 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /ws/*
> 2015-02-19 19:56:07,760 INFO [main] org.apache.hadoop.http.HttpServer2: Jetty bound to port 39939
> 2015-02-19 19:56:07,760 INFO [main] org.mortbay.log: jetty-6.1.26.cloudera.4
> 2015-02-19 19:56:07,789 INFO [main] org.mortbay.log: Extract jar:file:/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/jars/hadoop-yarn-common-2.5.0-cdh5.3.0.jar!/webapps/mapreduce to /tmp/Jetty_0_0_0_0_39939_mapreduce____.o5qk0w/webapp
> 2015-02-19 19:56:08,156 INFO [main] org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:39939
> 2015-02-19 19:56:08,157 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Web app /mapreduce started at 39939
> 2015-02-19 19:56:08,629 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules
> 2015-02-19 19:56:08,634 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
> 2015-02-19 19:56:08,635 INFO [Socket Reader #1 for port 43858] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 43858
> 2015-02-19 19:56:08,639 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
> 2015-02-19 19:56:08,642 INFO [IPC Server listener on 43858] org.apache.hadoop.ipc.Server: IPC Server listener on 43858: starting
> 2015-02-19 19:56:08,663 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: nodeBlacklistingEnabled:true
> 2015-02-19 19:56:08,663 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: maxTaskFailuresPerNode is 3
> 2015-02-19 19:56:08,663 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: blacklistDisablePercent is 33
> 2015-02-19 19:56:08,797 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.require.client.cert;  Ignoring.
> 2015-02-19 19:56:08,798 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
> 2015-02-19 19:56:08,798 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.client.conf;  Ignoring.
> 2015-02-19 19:56:08,798 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.keystores.factory.class;  Ignoring.
> 2015-02-19 19:56:08,799 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.server.conf;  Ignoring.
> 2015-02-19 19:56:08,809 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
> 2015-02-19 19:56:08,821 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at quickstart.cloudera/192.168.2.185:8030
> 2015-02-19 19:56:08,975 WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:cloudera (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): appattempt_1424003606313_0012_000002 not found in AMRMTokenSecretManager.
> 2015-02-19 19:56:08,976 WARN [main] org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): appattempt_1424003606313_0012_000002 not found in AMRMTokenSecretManager.
> 2015-02-19 19:56:08,976 WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:cloudera (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): appattempt_1424003606313_0012_000002 not found in AMRMTokenSecretManager.
> 2015-02-19 19:56:08,981 ERROR [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Exception while registering
> org.apache.hadoop.security.token.SecretManager$InvalidToken: appattempt_1424003606313_0012_000002 not found in AMRMTokenSecretManager.
>          at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>          at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>          at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>          at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>          at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>          at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
>          at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:109)
>          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>          at java.lang.reflect.Method.invoke(Method.java:606)
>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>          at com.sun.proxy.$Proxy36.registerApplicationMaster(Unknown Source)
>          at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:161)
>          at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:122)
>          at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:238)
>          at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>          at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.serviceStart(MRAppMaster.java:807)
>          at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>          at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
>          at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1075)
>          at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>          at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1478)
>          at java.security.AccessController.doPrivileged(Native Method)
>          at javax.security.auth.Subject.doAs(Subject.java:415)
>          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
>          at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1474)
>          at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1407)
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): appattempt_1424003606313_0012_000002 not found in AMRMTokenSecretManager.
>          at org.apache.hadoop.ipc.Client.call(Client.java:1411)
>          at org.apache.hadoop.ipc.Client.call(Client.java:1364)
>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>          at com.sun.proxy.$Proxy35.registerApplicationMaster(Unknown Source)
>          at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:106)
>          ... 22 more
> 2015-02-19 19:56:08,983 INFO [main] org.apache.hadoop.service.AbstractService: Service RMCommunicator failed in state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.security.token.SecretManager$InvalidToken: appattempt_1424003606313_0012_000002 not found in AMRMTokenSecretManager.
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.security.token.SecretManager$InvalidToken: appattempt_1424003606313_0012_000002 not found in AMRMTokenSecretManager.
>
> From: Ulul [mailto:hadoop@ulul.org]
> Sent: Thursday, February 19, 2015 5:08 PM
> To: user@hadoop.apache.org
> Subject: Re: Yarn AM is abending job when submitting a remote job to cluster
>
> Is your point that using the hdfs:// prefix is valid since your hdfs 
> client works?
> fs.defaultFS defines the namenode address and the filesystem type. It 
> doesn't imply that the prefix should be used for the yarn and mapreduce 
> options, which are not directly linked to hdfs.
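>
> Concretely, the scheme belongs only to the filesystem setting; the RM 
> address is a plain host:port pair. A minimal sketch (the hostname is 
> just an example):
>
>   <property>
>     <name>fs.defaultFS</name>
>     <value>hdfs://mycluster:8020</value>   <!-- scheme belongs here -->
>   </property>
>   <property>
>     <name>yarn.resourcemanager.address</name>
>     <value>mycluster:8032</value>          <!-- host:port only, no hdfs:// -->
>   </property>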
>
>
> On 19/02/2015 22:56, Ulul wrote:
>
>     In that case it's just between your hdfs client, the NN and the
>     DNs, no YARN or MR component involved.
>     The fact that this works is not related to your MR job not succeeding.
>
>
>         On 19/02/2015 22:45, roland.depratti wrote:
>
>         Thanks for looking at my problem.
>
>         From the client, using the config file listed, I can run an
>         hdfs command that cats a file in hdfs on the remote cluster and
>         returns the contents of that file to the client.
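>
>         For illustration, that kind of check looks roughly like this
>         (the file path is only a placeholder, not the exact file I used):
>
>         hadoop fs -conf ~/conf/hadoop-cluster.xml -cat /user/cloudera/somefile.txt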
>
>         - rd
>
>         Sent from my Verizon Wireless 4G LTE smartphone
>
>
>
>         -------- Original message --------
>         From: Ulul <hadoop@ulul.org> <mailto:hadoop@ulul.org>
>         Date:02/19/2015 4:03 PM (GMT-05:00)
>         To: user@hadoop.apache.org <mailto:user@hadoop.apache.org>
>         Subject: Re: Yarn AM is abending job when submitting a remote
>         job to cluster
>
>         Hi
>         Doesn't seem like an ssl error to me (the log states that
>         attempts to
>         override final properties are ignored)
>
>         On the other hand the configuration seems wrong:
>         mapreduce.jobtracker.address and yarn.resourcemanager.address
>         should only contain an IP or a hostname. You should remove
>         'hdfs://', though the log doesn't suggest it has anything to do
>         with your problem...
>
>         And what do you mean by an "HDFS job"?
>
>         Ulul
>
>         On 19/02/2015 04:22, daemeon reiydelle wrote:
>         > I would guess you do not have your ssl certs set up, client
>         or server,
>         > based on the error.
>         >
>         > “Life should not be a journey to the grave with the
>         > intention of arriving safely in a pretty and well preserved
>         > body, but rather to skid in broadside in a cloud of smoke,
>         > thoroughly used up, totally worn out, and loudly proclaiming
>         > “Wow! What a Ride!”
>         > - Hunter Thompson
>         >
>         > Daemeon C.M. Reiydelle
>         > USA (+1) 415.501.0198
>         > London (+44) (0) 20 8144 9872
>         >
>         > On Wed, Feb 18, 2015 at 5:19 PM, Roland DePratti
>         > <roland.depratti@cox.net> wrote:
>         >
>         >     I have been searching for a handle on a problem with very
>         >     few clues. Any help pointing me in the right direction
>         >     would be huge.
>         >
>         >     I have not received any input from the Cloudera Google
>         >     groups. Perhaps this is more Yarn-based, and I am hoping I
>         >     have more luck here.
>         >
>         >     Any help is greatly appreciated.
>         >
>         >     I am running a Hadoop cluster using CDH5.3. I also have
>         a client
>         >     machine with a standalone one node setup (VM).
>         >
>         >     All environments are running CentOS 6.6.
>         >
>         >     I have submitted some Java mapreduce jobs locally on both
>         >     the cluster and the standalone environment with successful
>         >     completions.
>         >
>         >     I can submit a remote HDFS job from client to cluster
>         using -conf
>         >     hadoop-cluster.xml (see below) and get data back from
>         the cluster
>         >     with no problem.
>         >
>         >     When I submit the mapreduce jobs remotely, I get an AM
>         >     error:
>         >
>         >     AM fails the job with the error:
>         >
>         >
>         >                SecretManager$InvalidToken:
>         >     appattempt_1424003606313_0001_000002 not found in
>         >     AMRMTokenSecretManager
>         >
>         >
>         >     I searched /var/log/secure on the client and cluster with no
>         >     unusual messages.
>         >
>         >     Here is the contents of hadoop-cluster.xml:
>         >
>         >     <?xml version="1.0" encoding="UTF-8"?>
>         >
>         >     <!--generated by Roland-->
>         >     <configuration>
>         >       <property>
>         >         <name>fs.defaultFS</name>
>         >         <value>hdfs://mycluser:8020</value>
>         >       </property>
>         >       <property>
>         >         <name>mapreduce.jobtracker.address</name>
>         >         <value>hdfs://mycluster:8032</value>
>         >       </property>
>         >       <property>
>         >         <name>yarn.resourcemanager.address</name>
>         >         <value>hdfs://mycluster:8032</value>
>         >       </property>
>         >     </configuration>
>         >
>         >     Here is the output from the job log on the cluster:
>         >
>         >     2015-02-15 07:51:06,544 INFO [main]
>         > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created
>         >     MRAppMaster for application
>         appattempt_1424003606313_0001_000002
>         >
>         >     2015-02-15 07:51:06,949 WARN [main]
>         >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>         >     override final parameter:
>         hadoop.ssl.require.client.cert;  Ignoring.
>         >
>         >     2015-02-15 07:51:06,952 WARN [main]
>         >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>         >     override final parameter:
>         > mapreduce.job.end-notification.max.retry.interval; Ignoring.
>         >
>         >     2015-02-15 07:51:06,952 WARN [main]
>         >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>         >     override final parameter: hadoop.ssl.client.conf;  Ignoring.
>         >
>         >     2015-02-15 07:51:06,954 WARN [main]
>         >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>         >     override final parameter:
>         hadoop.ssl.keystores.factory.class;
>         >     Ignoring.
>         >
>         >     2015-02-15 07:51:06,957 WARN [main]
>         >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>         >     override final parameter: hadoop.ssl.server.conf;  Ignoring.
>         >
>         >     2015-02-15 07:51:06,973 WARN [main]
>         >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>         >     override final parameter:
>         >     mapreduce.job.end-notification.max.attempts; Ignoring.
>         >
>         >     2015-02-15 07:51:07,241 INFO [main]
>         > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing
>         with tokens:
>         >
>         >     2015-02-15 07:51:07,241 INFO [main]
>         > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind:
>         >     YARN_AM_RM_TOKEN, Service: , Ident:
>         >     (org.apache.hadoop.yarn.security.AMRMTokenIdentifier@33be1aa0)
>         >
>         >     2015-02-15 07:51:07,332 INFO [main]
>         > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Using mapred
>         >     newApiCommitter.
>         >
>         >     2015-02-15 07:51:07,627 WARN [main]
>         >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>         >     override final parameter:
>         hadoop.ssl.require.client.cert;  Ignoring.
>         >
>         >     2015-02-15 07:51:07,632 WARN [main]
>         >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>         >     override final parameter:
>         > mapreduce.job.end-notification.max.retry.interval; Ignoring.
>         >
>         >     2015-02-15 07:51:07,632 WARN [main]
>         >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>         >     override final parameter: hadoop.ssl.client.conf;  Ignoring.
>         >
>         >     2015-02-15 07:51:07,639 WARN [main]
>         >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>         >     override final parameter:
>         hadoop.ssl.keystores.factory.class;
>         >     Ignoring.
>         >
>         >     2015-02-15 07:51:07,645 WARN [main]
>         >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>         >     override final parameter: hadoop.ssl.server.conf;  Ignoring.
>         >
>         >     2015-02-15 07:51:07,663 WARN [main]
>         >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>         >     override final parameter:
>         >     mapreduce.job.end-notification.max.attempts; Ignoring.
>         >
>         >     2015-02-15 07:51:08,237 WARN [main]
>         >     org.apache.hadoop.util.NativeCodeLoader: Unable to load
>         >     native-hadoop library for your platform... using
>         builtin-java
>         >     classes where applicable
>         >
>         >     2015-02-15 07:51:08,429 INFO [main]
>         > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter
>         >     set in config null
>         >
>         >     2015-02-15 07:51:08,499 INFO [main]
>         > org.apache.hadoop.mapreduce.v2.app.MRAppMaster:
>         OutputCommitter is
>         > org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
>         >
>         >     2015-02-15 07:51:08,526 INFO [main]
>         >     org.apache.hadoop.yarn.event.AsyncDispatcher:
>         Registering class
>         > org.apache.hadoop.mapreduce.jobhistory.EventType for class
>         > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler
>         >
>         >     2015-02-15 07:51:08,527 INFO [main]
>         >     org.apache.hadoop.yarn.event.AsyncDispatcher:
>         Registering class
>         > org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for
>         >     class
>         >
>         org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher
>         >
>         >     2015-02-15 07:51:08,561 INFO [main]
>         >     org.apache.hadoop.yarn.event.AsyncDispatcher:
>         Registering class
>         > org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for
>         >     class
>         >
>         org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher
>         >
>         >     2015-02-15 07:51:08,562 INFO [main]
>         >     org.apache.hadoop.yarn.event.AsyncDispatcher:
>         Registering class
>         >
>         org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType
>         >     for class
>         >
>         org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher
>         >
>         >     2015-02-15 07:51:08,566 INFO [main]
>         >     org.apache.hadoop.yarn.event.AsyncDispatcher:
>         Registering class
>         > org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for
>         >     class
>         org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
>         >
>         >     2015-02-15 07:51:08,568 INFO [main]
>         >     org.apache.hadoop.yarn.event.AsyncDispatcher:
>         Registering class
>         >
>         org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType
>         >     for class
>         >
>         org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher
>         >
>         >     2015-02-15 07:51:08,568 INFO [main]
>         >     org.apache.hadoop.yarn.event.AsyncDispatcher:
>         Registering class
>         >
>         org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType
>         >     for class
>         >
>         org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
>         >
>         >     2015-02-15 07:51:08,570 INFO [main]
>         >     org.apache.hadoop.yarn.event.AsyncDispatcher:
>         Registering class
>         >
>         org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType
>         >     for class
>         >
>         org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter
>         >
>         >     2015-02-15 07:51:08,599 INFO [main]
>         > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Recovery is
>         >     enabled. Will try to recover from previous life on best
>         effort basis.
>         >
>         >     2015-02-15 07:51:08,642 INFO [main]
>         > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Previous history
>         >     file is at
>         >     hdfs://mycluster.com:8020/user/cloudera/.staging/job_1424003606313_0001/job_1424003606313_0001_1.jhist
>         >
>         >     2015-02-15 07:51:09,147 INFO [main]
>         >     org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Read
>         >     completed tasks from history 0
>         >
>         >     2015-02-15 07:51:09,193 INFO [main]
>         >     org.apache.hadoop.yarn.event.AsyncDispatcher:
>         Registering class
>         > org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type
>         >     for class
>         >
>         org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler
>         >
>         >     2015-02-15 07:51:09,222 INFO [main]
>         > org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties
>         >     from hadoop-metrics2.properties
>         >
>         >     2015-02-15 07:51:09,277 INFO [main]
>

