hadoop-mapreduce-user mailing list archives

From Alexander Alten-Lorenz <wget.n...@gmail.com>
Subject Re: Yarn AM is abending job when submitting a remote job to cluster
Date Fri, 20 Feb 2015 07:11:56 GMT
When you search for your launched YARN app (1424003606313_0012) in the logs, you’ll see:

Application appattempt_1424003606313_0012_000001 is done. finalState=FAILED
Application appattempt_1424003606313_0012_000002 is done. finalState=FAILED
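
A minimal sketch of that search, assuming the ResourceManager log is available as plain text lines. The function name `attempt_final_states` is illustrative, and the sample lines are abbreviated copies of the FairScheduler lines in the quoted log below.

```python
import re

# Matches FairScheduler lines such as:
#   "Application appattempt_1424003606313_0012_000001 is done. finalState=FAILED"
DONE_RE = re.compile(r"Application (appattempt_\d+_\d+_\d+) is done\. finalState=(\w+)")

def attempt_final_states(lines, app_suffix):
    """Return {attempt_id: final_state} for the attempts of one application.

    app_suffix is the "<clusterTimestamp>_<sequence>" part of the id,
    e.g. "1424003606313_0012".
    """
    states = {}
    for line in lines:
        m = DONE_RE.search(line)
        if m and m.group(1).startswith("appattempt_" + app_suffix + "_"):
            states[m.group(1)] = m.group(2)
    return states

# Abbreviated sample lines (logger prefix shortened for readability).
sample = [
    "2015-02-19 19:55:57,943 INFO FairScheduler: Application appattempt_1424003606313_0012_000001 is done. finalState=FAILED",
    "2015-02-19 19:56:10,957 INFO FairScheduler: Application appattempt_1424003606313_0012_000002 is done. finalState=FAILED",
]
print(attempt_final_states(sample, "1424003606313_0012"))
```

Both attempts show up as FAILED, which is why the application as a whole is failed after the second one.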

hadoop1:
2015-02-19 19:55:50,671 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_1424003606313_0012_01_000001 of capacity <memory:1024, vCores:1> on host hadoop0.rdpratti.com:8041, which has 1 containers, <memory:1024, vCores:1> used and <memory:433, vCores:1> available after allocation
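
The numbers in that scheduler line can be checked by hand. A small sketch, assuming the node total of 1457 MB that the quoted log reports once the container is released (`<memory:1457, vCores:2> available`). The variable names are illustrative, and the size of a second container is a hypothetical value, not something stated in the thread.

```python
# Values copied from the SchedulerNode log lines (all in MB).
node_total_mb = 1457      # reported as available once the node is empty
am_container_mb = 1024    # capacity assigned to the AM container

remaining_mb = node_total_mb - am_container_mb
print(remaining_mb)  # 433, matching "<memory:433, vCores:1> available after allocation"

# Hypothetical: a second container of the same 1024 MB size would not fit.
fits_second_container = remaining_mb >= am_container_mb
print(fits_second_container)  # False
```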

yarn launches:
$JAVA_HOME/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=<LOG_DIR> -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA  -Djava.net.preferIPv4Stack=true -Xmx209715200 org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr 
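
To put the `-Xmx209715200` in that command into perspective, here is a rough sketch that converts the flag to MiB. The helper name `xmx_bytes` is made up for illustration, and the launch string is a shortened copy of the command above.

```python
import re

# Multipliers for the optional -Xmx size suffix (no suffix means bytes).
UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}

def xmx_bytes(command):
    """Return the JVM max heap from a -Xmx flag in bytes, or None if absent."""
    m = re.search(r"-Xmx(\d+)([kKmMgG]?)(?=\s|$)", command)
    if m is None:
        return None
    return int(m.group(1)) * UNITS[m.group(2).lower()]

# Shortened copy of the AM launch command from the RM log.
launch = ("$JAVA_HOME/bin/java -Djava.net.preferIPv4Stack=true "
          "-Xmx209715200 org.apache.hadoop.mapreduce.v2.app.MRAppMaster")

heap_mib = xmx_bytes(launch) / (1024 * 1024)
print(heap_mib)  # 200.0
```

So the AM JVM is capped at 200 MiB of heap inside a container that was allocated 1024 MB.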

Your app gets stored, submitted, launched, and then fails. Increase the memory and add more vcores to the VM, and if you are using the Cloudera VM, have a look at Cloudera Manager, which should be listening on port 7180.
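
The stored, submitted, launched, failed sequence can be read off the RMAppAttemptImpl "State change" lines in the quoted log. A sketch, assuming those lines keep the format shown there; `attempt_timeline` is an illustrative name and the sample is an abbreviated excerpt for the first attempt.

```python
import re

# Matches RMAppAttemptImpl lines such as:
#   "appattempt_..._000001 State change from ALLOCATED to LAUNCHED"
CHANGE_RE = re.compile(r"(appattempt_\S+) State change from (\w+) to (\w+)")

def attempt_timeline(lines):
    """Group state changes by attempt id, preserving log order."""
    timeline = {}
    for line in lines:
        m = CHANGE_RE.search(line)
        if m:
            attempt, old, new = m.groups()
            # Start each attempt's list with its first "from" state.
            timeline.setdefault(attempt, [old]).append(new)
    return timeline

# Abbreviated sample lines (logger prefix shortened for readability).
sample = [
    "19:55:50,688 INFO RMAppAttemptImpl: appattempt_1424003606313_0012_000001 State change from ALLOCATED to LAUNCHED",
    "19:55:57,942 INFO RMAppAttemptImpl: appattempt_1424003606313_0012_000001 State change from LAUNCHED to FINAL_SAVING",
    "19:55:57,943 INFO RMAppAttemptImpl: appattempt_1424003606313_0012_000001 State change from FINAL_SAVING to FAILED",
]
for attempt, states in attempt_timeline(sample).items():
    print(attempt, " -> ".join(states))
# prints: appattempt_1424003606313_0012_000001 ALLOCATED -> LAUNCHED -> FINAL_SAVING -> FAILED
```

In this excerpt the attempt goes from LAUNCHED straight to FINAL_SAVING, which suggests the AM exited shortly after start.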

This has nothing to do with SSL or anything similar, since YARN uses secure tokens even in an unsecured environment.

BR,
 Alex


> On 20 Feb 2015, at 03:09, Roland DePratti <roland.depratti@cox.net> wrote:
> 
> Xuan,
>  
> Thanks for asking. Here is the RM log. It almost looks like the log completes successfully (see red highlighting).
>  
>  
>  
> 2015-02-19 19:55:43,315 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new applicationId: 12
> 2015-02-19 19:55:44,659 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Application with id 12 submitted by user cloudera
> 2015-02-19 19:55:44,659 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Storing application with id application_1424003606313_0012
> 2015-02-19 19:55:44,659 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=cloudera    IP=192.168.2.185    OPERATION=Submit Application Request    TARGET=ClientRMService    RESULT=SUCCESS    APPID=application_1424003606313_0012
> 2015-02-19 19:55:44,659 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1424003606313_0012 State change from NEW to NEW_SAVING
> 2015-02-19 19:55:44,659 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Storing info for app: application_1424003606313_0012
> 2015-02-19 19:55:44,660 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1424003606313_0012 State change from NEW_SAVING to SUBMITTED
> 2015-02-19 19:55:44,666 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Accepted application application_1424003606313_0012 from user: cloudera, in queue: default, currently num of applications: 1
> 2015-02-19 19:55:44,667 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1424003606313_0012 State change from SUBMITTED to ACCEPTED
> 2015-02-19 19:55:44,667 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1424003606313_0012_000001
> 2015-02-19 19:55:44,667 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1424003606313_0012_000001 State change from NEW to SUBMITTED
> 2015-02-19 19:55:44,667 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Added Application Attempt appattempt_1424003606313_0012_000001 to scheduler from user: cloudera
> 2015-02-19 19:55:44,669 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1424003606313_0012_000001 State change from SUBMITTED to SCHEDULED
> 2015-02-19 19:55:50,671 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1424003606313_0012_01_000001 Container Transitioned from NEW to ALLOCATED
> 2015-02-19 19:55:50,671 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=cloudera    OPERATION=AM Allocated Container    TARGET=SchedulerApp    RESULT=SUCCESS    APPID=application_1424003606313_0012    CONTAINERID=container_1424003606313_0012_01_000001
> 2015-02-19 19:55:50,671 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_1424003606313_0012_01_000001 of capacity <memory:1024, vCores:1> on host hadoop0.rdpratti.com:8041, which has 1 containers, <memory:1024, vCores:1> used and <memory:433, vCores:1> available after allocation
> 2015-02-19 19:55:50,672 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Sending NMToken for nodeId : hadoop0.rdpratti.com:8041 for container : container_1424003606313_0012_01_000001
> 2015-02-19 19:55:50,672 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1424003606313_0012_01_000001 Container Transitioned from ALLOCATED to ACQUIRED
> 2015-02-19 19:55:50,673 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Clear node set for appattempt_1424003606313_0012_000001
> 2015-02-19 19:55:50,673 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Storing attempt: AppId: application_1424003606313_0012 AttemptId: appattempt_1424003606313_0012_000001 MasterContainer: Container: [ContainerId: container_1424003606313_0012_01_000001, NodeId: hadoop0.rdpratti.com:8041, NodeHttpAddress: hadoop0.rdpratti.com:8042, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 192.168.2.253:8041 }, ]
> 2015-02-19 19:55:50,673 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1424003606313_0012_000001 State change from SCHEDULED to ALLOCATED_SAVING
> 2015-02-19 19:55:50,673 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1424003606313_0012_000001 State change from ALLOCATED_SAVING to ALLOCATED
> 2015-02-19 19:55:50,673 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Launching masterappattempt_1424003606313_0012_000001
> 2015-02-19 19:55:50,674 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Setting up container Container: [ContainerId: container_1424003606313_0012_01_000001, NodeId: hadoop0.rdpratti.com:8041, NodeHttpAddress: hadoop0.rdpratti.com:8042, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 192.168.2.253:8041 }, ] for AM appattempt_1424003606313_0012_000001
> 2015-02-19 19:55:50,675 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to launch container container_1424003606313_0012_01_000001 : $JAVA_HOME/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=<LOG_DIR> -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA  -Djava.net.preferIPv4Stack=true -Xmx209715200 org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr 
> 2015-02-19 19:55:50,675 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Create AMRMToken for ApplicationAttempt: appattempt_1424003606313_0012_000001
> 2015-02-19 19:55:50,675 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Creating password for appattempt_1424003606313_0012_000001
> 2015-02-19 19:55:50,688 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Done launching container Container: [ContainerId: container_1424003606313_0012_01_000001, NodeId: hadoop0.rdpratti.com:8041, NodeHttpAddress: hadoop0.rdpratti.com:8042, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 192.168.2.253:8041 }, ] for AM appattempt_1424003606313_0012_000001
> 2015-02-19 19:55:50,688 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1424003606313_0012_000001 State change from ALLOCATED to LAUNCHED
> 2015-02-19 19:55:50,928 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1424003606313_0012_01_000001 Container Transitioned from ACQUIRED to RUNNING
> 2015-02-19 19:55:57,941 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1424003606313_0012_01_000001 Container Transitioned from RUNNING to COMPLETED
> 2015-02-19 19:55:57,941 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: Completed container: container_1424003606313_0012_01_000001 in state: COMPLETED event:FINISHED
> 2015-02-19 19:55:57,942 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=cloudera    OPERATION=AM Released Container    TARGET=SchedulerApp    RESULT=SUCCESS    APPID=application_1424003606313_0012    CONTAINERID=container_1424003606313_0012_01_000001
> 2015-02-19 19:55:57,942 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Released container container_1424003606313_0012_01_000001 of capacity <memory:1024, vCores:1> on host hadoop0.rdpratti.com:8041, which currently has 0 containers, <memory:0, vCores:0> used and <memory:1457, vCores:2> available, release resources=true
> 2015-02-19 19:55:57,942 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Application attempt appattempt_1424003606313_0012_000001 released container container_1424003606313_0012_01_000001 on node: host: hadoop0.rdpratti.com:8041 #containers=0 available=1457 used=0 with event: FINISHED
> 2015-02-19 19:55:57,942 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Updating application attempt appattempt_1424003606313_0012_000001 with final state: FAILED, and exit status: 1
> 2015-02-19 19:55:57,942 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1424003606313_0012_000001 State change from LAUNCHED to FINAL_SAVING
> 2015-02-19 19:55:57,942 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Unregistering app attempt : appattempt_1424003606313_0012_000001
> 2015-02-19 19:55:57,943 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Application finished, removing password for appattempt_1424003606313_0012_000001
> 2015-02-19 19:55:57,943 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1424003606313_0012_000001 State change from FINAL_SAVING to FAILED
> 2015-02-19 19:55:57,943 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Application appattempt_1424003606313_0012_000001 is done. finalState=FAILED
> 2015-02-19 19:55:57,943 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1424003606313_0012_000002
> 2015-02-19 19:55:57,943 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1424003606313_0012 requests cleared
> 2015-02-19 19:55:57,943 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1424003606313_0012_000002 State change from NEW to SUBMITTED
> 2015-02-19 19:55:57,943 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Added Application Attempt appattempt_1424003606313_0012_000002 to scheduler from user: cloudera
> 2015-02-19 19:55:57,943 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1424003606313_0012_000002 State change from SUBMITTED to SCHEDULED
> 2015-02-19 19:55:58,941 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed...
> 2015-02-19 19:56:03,950 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1424003606313_0012_02_000001 Container Transitioned from NEW to ALLOCATED
> 2015-02-19 19:56:03,950 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=cloudera    OPERATION=AM Allocated Container    TARGET=SchedulerApp    RESULT=SUCCESS    APPID=application_1424003606313_0012    CONTAINERID=container_1424003606313_0012_02_000001
> 2015-02-19 19:56:03,950 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_1424003606313_0012_02_000001 of capacity <memory:1024, vCores:1> on host hadoop0.rdpratti.com:8041, which has 1 containers, <memory:1024, vCores:1> used and <memory:433, vCores:1> available after allocation
> 2015-02-19 19:56:03,950 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Sending NMToken for nodeId : hadoop0.rdpratti.com:8041 for container : container_1424003606313_0012_02_000001
> 2015-02-19 19:56:03,951 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1424003606313_0012_02_000001 Container Transitioned from ALLOCATED to ACQUIRED
> 2015-02-19 19:56:03,951 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Clear node set for appattempt_1424003606313_0012_000002
> 2015-02-19 19:56:03,951 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Storing attempt: AppId: application_1424003606313_0012 AttemptId: appattempt_1424003606313_0012_000002 MasterContainer: Container: [ContainerId: container_1424003606313_0012_02_000001, NodeId: hadoop0.rdpratti.com:8041, NodeHttpAddress: hadoop0.rdpratti.com:8042, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 192.168.2.253:8041 }, ]
> 2015-02-19 19:56:03,952 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1424003606313_0012_000002 State change from SCHEDULED to ALLOCATED_SAVING
> 2015-02-19 19:56:03,952 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1424003606313_0012_000002 State change from ALLOCATED_SAVING to ALLOCATED
> 2015-02-19 19:56:03,952 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Launching masterappattempt_1424003606313_0012_000002
> 2015-02-19 19:56:03,953 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Setting up container Container: [ContainerId: container_1424003606313_0012_02_000001, NodeId: hadoop0.rdpratti.com:8041, NodeHttpAddress: hadoop0.rdpratti.com:8042, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 192.168.2.253:8041 }, ] for AM appattempt_1424003606313_0012_000002
> 2015-02-19 19:56:03,953 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to launch container container_1424003606313_0012_02_000001 : $JAVA_HOME/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=<LOG_DIR> -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA  -Djava.net.preferIPv4Stack=true -Xmx209715200 org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr 
> 2015-02-19 19:56:03,953 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Create AMRMToken for ApplicationAttempt: appattempt_1424003606313_0012_000002
> 2015-02-19 19:56:03,953 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Creating password for appattempt_1424003606313_0012_000002
> 2015-02-19 19:56:03,974 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Done launching container Container: [ContainerId: container_1424003606313_0012_02_000001, NodeId: hadoop0.rdpratti.com:8041, NodeHttpAddress: hadoop0.rdpratti.com:8042, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 192.168.2.253:8041 }, ] for AM appattempt_1424003606313_0012_000002
> 2015-02-19 19:56:03,974 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1424003606313_0012_000002 State change from ALLOCATED to LAUNCHED
> 2015-02-19 19:56:04,947 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1424003606313_0012_02_000001 Container Transitioned from ACQUIRED to RUNNING
> 2015-02-19 19:56:10,956 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1424003606313_0012_02_000001 Container Transitioned from RUNNING to COMPLETED
> 2015-02-19 19:56:10,956 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: Completed container: container_1424003606313_0012_02_000001 in state: COMPLETED event:FINISHED
> 2015-02-19 19:56:10,956 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=cloudera    OPERATION=AM Released Container    TARGET=SchedulerApp    RESULT=SUCCESS    APPID=application_1424003606313_0012    CONTAINERID=container_1424003606313_0012_02_000001
> 2015-02-19 19:56:10,956 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Released container container_1424003606313_0012_02_000001 of capacity <memory:1024, vCores:1> on host hadoop0.rdpratti.com:8041, which currently has 0 containers, <memory:0, vCores:0> used and <memory:1457, vCores:2> available, release resources=true
> 2015-02-19 19:56:10,956 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Updating application attempt appattempt_1424003606313_0012_000002 with final state: FAILED, and exit status: 1
> 2015-02-19 19:56:10,956 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Application attempt appattempt_1424003606313_0012_000002 released container container_1424003606313_0012_02_000001 on node: host: hadoop0.rdpratti.com:8041 #containers=0 available=1457 used=0 with event: FINISHED
> 2015-02-19 19:56:10,956 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1424003606313_0012_000002 State change from LAUNCHED to FINAL_SAVING
> 2015-02-19 19:56:10,956 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Unregistering app attempt : appattempt_1424003606313_0012_000002
> 2015-02-19 19:56:10,957 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Application finished, removing password for appattempt_1424003606313_0012_000002
> 2015-02-19 19:56:10,957 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1424003606313_0012_000002 State change from FINAL_SAVING to FAILED
> 2015-02-19 19:56:10,957 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Updating application application_1424003606313_0012 with final state: FAILED
> 2015-02-19 19:56:10,957 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1424003606313_0012 State change from ACCEPTED to FINAL_SAVING
> 2015-02-19 19:56:10,957 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating info for app: application_1424003606313_0012
> 2015-02-19 19:56:10,957 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Application appattempt_1424003606313_0012_000002 is done. finalState=FAILED
> 2015-02-19 19:56:10,957 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1424003606313_0012 requests cleared
> 2015-02-19 19:56:10,990 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1424003606313_0012 failed 2 times due to AM Container for appattempt_1424003606313_0012_000002 exited with  exitCode: 1 due to: Exception from container-launch.
> Container id: container_1424003606313_0012_02_000001
> Exit code: 1
> Stack trace: ExitCodeException exitCode=1: 
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
>     at org.apache.hadoop.util.Shell.run(Shell.java:455)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:197)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> 
> 
> Container exited with a non-zero exit code 1
> .Failing this attempt.. Failing the application.
> 2015-02-19 19:56:10,990 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1424003606313_0012 State change from FINAL_SAVING to FAILED
> 2015-02-19 19:56:10,991 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=cloudera    OPERATION=Application Finished - Failed    TARGET=RMAppManager    RESULT=FAILURE    DESCRIPTION=App failed with state: FAILED    PERMISSIONS=Application application_1424003606313_0012 failed 2 times due to AM Container for appattempt_1424003606313_0012_000002 exited with  exitCode: 1 due to: Exception from container-launch.
> Container id: container_1424003606313_0012_02_000001
> Exit code: 1
> Stack trace: ExitCodeException exitCode=1: 
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
>     at org.apache.hadoop.util.Shell.run(Shell.java:455)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:197)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> 
> 
>  
>  
> From: Xuan Gong [mailto:xgong@hortonworks.com] 
> Sent: Thursday, February 19, 2015 8:23 PM
> To: user@hadoop.apache.org
> Subject: Re: Yarn AM is abending job when submitting a remote job to cluster
>  
> Hey, Roland:
>     Could you also check the RM logs for this application, please? Maybe we could find something there.
>  
> Thanks
>  
> Xuan Gong
>  
> From: Roland DePratti <roland.depratti@cox.net>
> Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>
> Date: Thursday, February 19, 2015 at 5:11 PM
> To: "user@hadoop.apache.org" <user@hadoop.apache.org>
> Subject: RE: Yarn AM is abending job when submitting a remote job to cluster
>  
> No, I hear you.  
>  
> I was just stating that, since HDFS works, something is right about the connectivity, that’s all, i.e. the server is reachable and Hadoop was able to process the request, but like you said, that doesn’t mean YARN works.
>  
> I tried both your solution and Alex’s solution, unfortunately without any improvement.
>  
> Here is the command I am executing:
>  
> hadoop jar avgWordlength.jar  solution.AvgWordLength -conf ~/conf/hadoop-cluster.xml /user/cloudera/shakespeare wordlength4
>  
> Here is the new hadoop-cluster.xml:
>  
> <?xml version="1.0" encoding="UTF-8"?>
> 
> <!--generated by Roland-->
> <configuration>
>   <property>
>     <name>fs.defaultFS</name>
>     <value>hdfs://hadoop0.rdpratti.com:8020</value>
>   </property>
>   <property>
>     <name>mapreduce.jobtracker.address</name>
>     <value>hadoop0.rdpratti.com:8032</value>
>   </property>
>   <property>
>     <name>yarn.resourcemanager.address</name>
>     <value>hadoop0.rdpratti.com:8032</value>
>   </property>
> </configuration>
> 
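
To double-check which properties a client-side file like this actually sets, a short sketch along these lines may help. It assumes a complete Hadoop-style `<configuration>` document; the function name `hadoop_properties` is illustrative, and the embedded XML repeats the three properties quoted above.

```python
import xml.etree.ElementTree as ET

def hadoop_properties(xml_text):
    """Return {name: value} from a Hadoop-style <configuration> document."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }

# The three properties from the quoted file, wrapped in a closed root element.
conf_xml = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop0.rdpratti.com:8020</value>
  </property>
  <property>
    <name>mapreduce.jobtracker.address</name>
    <value>hadoop0.rdpratti.com:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>hadoop0.rdpratti.com:8032</value>
  </property>
</configuration>"""

props = hadoop_properties(conf_xml)
for name in sorted(props):
    print(name, "=", props[name])
```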
> 
> I also deleted the .staging directory under the submitting user and restarted the Job History Server.
>  
> Resubmitted the job with the same result. Here is the log:
>  
> 2015-02-19 19:56:05,061 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1424003606313_0012_000002
> 2015-02-19 19:56:05,468 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.require.client.cert;  Ignoring.
> 2015-02-19 19:56:05,471 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
> 2015-02-19 19:56:05,471 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.client.conf;  Ignoring.
> 2015-02-19 19:56:05,473 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.keystores.factory.class;  Ignoring.
> 2015-02-19 19:56:05,476 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.server.conf;  Ignoring.
> 2015-02-19 19:56:05,490 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
> 2015-02-19 19:56:05,621 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
> 2015-02-19 19:56:05,621 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (org.apache.hadoop.yarn.security.AMRMTokenIdentifier@3909f88f)
> 2015-02-19 19:56:05,684 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Using mapred newApiCommitter.
> 2015-02-19 19:56:05,923 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.require.client.cert;  Ignoring.
> 2015-02-19 19:56:05,925 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
> 2015-02-19 19:56:05,929 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.client.conf;  Ignoring.
> 2015-02-19 19:56:05,930 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.keystores.factory.class;  Ignoring.
> 2015-02-19 19:56:05,934 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.server.conf;  Ignoring.
> 2015-02-19 19:56:05,958 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
> 2015-02-19 19:56:06,529 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 2015-02-19 19:56:06,719 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config null
> 2015-02-19 19:56:06,837 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
> 2015-02-19 19:56:06,881 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.jobhistory.EventType for class org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler
> 2015-02-19 19:56:06,882 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher
> 2015-02-19 19:56:06,882 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher
> 2015-02-19 19:56:06,883 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher
> 2015-02-19 19:56:06,884 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
> 2015-02-19 19:56:06,885 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher
> 2015-02-19 19:56:06,885 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
> 2015-02-19 19:56:06,886 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter
> 2015-02-19 19:56:06,899 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Recovery is enabled. Will try to recover from previous life on best effort basis.
> 2015-02-19 19:56:06,918 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Previous history file is at hdfs://hadoop0.rdpratti.com:8020/user/cloudera/.staging/job_1424003606313_0012/job_1424003606313_0012_1.jhist
> 2015-02-19 19:56:07,377 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Read completed tasks from history 0
> 2015-02-19 19:56:07,423 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler
> 2015-02-19 19:56:07,453 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
> 2015-02-19 19:56:07,507 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
> 2015-02-19 19:56:07,507 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MRAppMaster metrics system started
> 2015-02-19 19:56:07,515 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Adding job token for job_1424003606313_0012 to jobTokenSecretManager
> 2015-02-19 19:56:07,536 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Not uberizing job_1424003606313_0012 because: not enabled; too much RAM;
> 2015-02-19 19:56:07,555 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job job_1424003606313_0012 = 5343207. Number of splits = 5
> 2015-02-19 19:56:07,557 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Number of reduces for job job_1424003606313_0012 = 1
> 2015-02-19 19:56:07,557 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1424003606313_0012Job Transitioned from NEW to INITED
> 2015-02-19 19:56:07,558 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster launching normal, non-uberized, multi-container job job_1424003606313_0012.
> 2015-02-19 19:56:07,618 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
> 2015-02-19 19:56:07,630 INFO [Socket Reader #1 for port 46841] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 46841
> 2015-02-19 19:56:07,648 INFO [main] org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.mapreduce.v2.api.MRClientProtocolPB to the server
> 2015-02-19 19:56:07,648 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
> 2015-02-19 19:56:07,649 INFO [main] org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Instantiated MRClientService at hadoop0.rdpratti.com/192.168.2.253:46841
> 2015-02-19 19:56:07,650 INFO [IPC Server listener on 46841] org.apache.hadoop.ipc.Server: IPC Server listener on 46841: starting
> 2015-02-19 19:56:07,721 INFO [main] org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
> 2015-02-19 19:56:07,727 INFO [main] org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.mapreduce is not defined
> 2015-02-19 19:56:07,739 INFO [main] org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
> 2015-02-19 19:56:07,745 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context mapreduce
> 2015-02-19 19:56:07,745 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context static
> 2015-02-19 19:56:07,749 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /mapreduce/*
> 2015-02-19 19:56:07,749 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /ws/*
> 2015-02-19 19:56:07,760 INFO [main] org.apache.hadoop.http.HttpServer2: Jetty bound to port 39939
> 2015-02-19 19:56:07,760 INFO [main] org.mortbay.log: jetty-6.1.26.cloudera.4
> 2015-02-19 19:56:07,789 INFO [main] org.mortbay.log: Extract jar:file:/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/jars/hadoop-yarn-common-2.5.0-cdh5.3.0.jar!/webapps/mapreduce to /tmp/Jetty_0_0_0_0_39939_mapreduce____.o5qk0w/webapp
> 2015-02-19 19:56:08,156 INFO [main] org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:39939
> 2015-02-19 19:56:08,157 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Web app /mapreduce started at 39939
> 2015-02-19 19:56:08,629 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules
> 2015-02-19 19:56:08,634 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
> 2015-02-19 19:56:08,635 INFO [Socket Reader #1 for port 43858] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 43858
> 2015-02-19 19:56:08,639 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
> 2015-02-19 19:56:08,642 INFO [IPC Server listener on 43858] org.apache.hadoop.ipc.Server: IPC Server listener on 43858: starting
> 2015-02-19 19:56:08,663 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: nodeBlacklistingEnabled:true
> 2015-02-19 19:56:08,663 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: maxTaskFailuresPerNode is 3
> 2015-02-19 19:56:08,663 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: blacklistDisablePercent is 33
> 2015-02-19 19:56:08,797 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.require.client.cert;  Ignoring.
> 2015-02-19 19:56:08,798 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
> 2015-02-19 19:56:08,798 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.client.conf;  Ignoring.
> 2015-02-19 19:56:08,798 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.keystores.factory.class;  Ignoring.
> 2015-02-19 19:56:08,799 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.server.conf;  Ignoring.
> 2015-02-19 19:56:08,809 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
> 2015-02-19 19:56:08,821 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at quickstart.cloudera/192.168.2.185:8030
> 2015-02-19 19:56:08,975 WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:cloudera (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): appattempt_1424003606313_0012_000002 not found in AMRMTokenSecretManager.
> 2015-02-19 19:56:08,976 WARN [main] org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): appattempt_1424003606313_0012_000002 not found in AMRMTokenSecretManager.
> 2015-02-19 19:56:08,976 WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:cloudera (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): appattempt_1424003606313_0012_000002 not found in AMRMTokenSecretManager.
> 2015-02-19 19:56:08,981 ERROR [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Exception while registering
> org.apache.hadoop.security.token.SecretManager$InvalidToken: appattempt_1424003606313_0012_000002 not found in AMRMTokenSecretManager.
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>         at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>         at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
>         at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:109)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy36.registerApplicationMaster(Unknown Source)
>         at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:161)
>         at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:122)
>         at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:238)
>         at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.serviceStart(MRAppMaster.java:807)
>         at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>         at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1075)
>         at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1478)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1474)
>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1407)
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): appattempt_1424003606313_0012_000002 not found in AMRMTokenSecretManager.
>         at org.apache.hadoop.ipc.Client.call(Client.java:1411)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1364)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at com.sun.proxy.$Proxy35.registerApplicationMaster(Unknown Source)
>         at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:106)
>         ... 22 more
> 2015-02-19 19:56:08,983 INFO [main] org.apache.hadoop.service.AbstractService: Service RMCommunicator failed in state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.security.token.SecretManager$InvalidToken: appattempt_1424003606313_0012_000002 not found in AMRMTokenSecretManager.
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.security.token.SecretManager$InvalidToken: appattempt_1424003606313_0012_000002 not found in AMRMTokenSecretManager.
>  
>  
>  
>  
> From: Ulul [mailto:hadoop@ulul.org]
> Sent: Thursday, February 19, 2015 5:08 PM
> To: user@hadoop.apache.org
> Subject: Re: Yarn AM is abending job when submitting a remote job to cluster
>  
> Is your point that using the hdfs:// prefix is valid since your hdfs client works?
> fs.defaultFS defines the namenode address and the filesystem type. It doesn't imply that the prefix should be used for yarn and mapreduce options that are not directly linked to hdfs.
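>
> A minimal sketch of what that would look like in hadoop-cluster.xml, assuming the
> cluster is reachable as "mycluster" and the ports are the ones from the posted
> config (both are placeholders taken from this thread, not verified against the
> real cluster). Only fs.defaultFS keeps a scheme; the other two are plain host:port:
>
> ```xml
> <?xml version="1.0" encoding="UTF-8"?>
> <configuration>
>   <!-- Filesystem URI: the hdfs:// scheme belongs here and only here. -->
>   <property>
>     <name>fs.defaultFS</name>
>     <value>hdfs://mycluster:8020</value>
>   </property>
>   <!-- Assumed placeholder host and port; no scheme prefix. -->
>   <property>
>     <name>mapreduce.jobtracker.address</name>
>     <value>mycluster:8032</value>
>   </property>
>   <!-- Assumed placeholder host and port; no scheme prefix. -->
>   <property>
>     <name>yarn.resourcemanager.address</name>
>     <value>mycluster:8032</value>
>   </property>
> </configuration>
> ```
>
> As noted, the log doesn't suggest the prefix is what causes the InvalidToken error, so this is a cleanup rather than a confirmed fix.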
> 
> 
> 
> On 19/02/2015 22:56, Ulul wrote:
>> In that case it's just between your hdfs client, the NN and the DNs, no YARN or MR component involved.
>> The fact that this works is not related to your MR job not succeeding.
>> 
>> 
>> 
>> On 19/02/2015 22:45, roland.depratti wrote:
>>> Thanks for looking at my problem.
>>>  
>>> I can run an hdfs command from the client, with the config file listed, that does a cat on a file in hdfs on the remote cluster and returns the contents of that file to the client.
>>>  
>>> - rd
>>>  
>>>  
>>> Sent from my Verizon Wireless 4G LTE smartphone
>>> 
>>> 
>>> -------- Original message --------
>>> From: Ulul <hadoop@ulul.org>
>>> Date: 02/19/2015 4:03 PM (GMT-05:00)
>>> To: user@hadoop.apache.org
>>> Subject: Re: Yarn AM is abending job when submitting a remote job to cluster 
>>> 
>>> Hi
>>> Doesn't seem like an ssl error to me (the log states that attempts to 
>>> override final properties are ignored)
>>> 
>>> On the other hand the configuration seems wrong:
>>> mapreduce.jobtracker.address and yarn.resourcemanager.address should
>>> only contain an IP or a hostname (plus port). You should remove 'hdfs://', though the
>>> log doesn't suggest it has anything to do with your problem...
>>> 
>>> And what do you mean by an "HDFS job" ?
>>> 
>>> Ulul
>>> 
>>> On 19/02/2015 04:22, daemeon reiydelle wrote:
>>> > I would guess you do not have your ssl certs set up, client or server, 
>>> > based on the error.
>>> >
>>> > “Life should not be a journey to the grave with the intention of 
>>> > arriving safely in a
>>> > pretty and well preserved body, but rather to skid in broadside in a 
>>> > cloud of smoke,
>>> > thoroughly used up, totally worn out, and loudly proclaiming “Wow! 
>>> > What a Ride!”
>>> > - Hunter Thompson
>>> >
>>> > Daemeon C.M. Reiydelle
>>> > USA (+1) 415.501.0198
>>> > London (+44) (0) 20 8144 9872
>>> >
>>> > On Wed, Feb 18, 2015 at 5:19 PM, Roland DePratti 
>>> > <roland.depratti@cox.net> wrote:
>>> >
>>> >     I have been searching for a handle on a problem with very
>>> >     few clues. Any help pointing me in the right direction will be
>>> >     huge.
>>> >
>>> >     I have not received any input from the Cloudera Google Groups.
>>> >     Perhaps this is more Yarn based and I am hoping I have more luck here.
>>> >
>>> >     Any help is greatly appreciated.
>>> >
>>> >     I am running a Hadoop cluster using CDH5.3. I also have a client
>>> >     machine with a standalone one node setup (VM).
>>> >
>>> >     All environments are running CentOS 6.6.
>>> >
>>> >     I have submitted some Java mapreduce jobs locally on both the
>>> >     cluster and the standalone environment, and they completed successfully.
>>> >
>>> >     I can submit a remote HDFS job from client to cluster using -conf
>>> >     hadoop-cluster.xml (see below) and get data back from the cluster
>>> >     with no problem.
>>> >
>>> >     When I submit the mapreduce jobs remotely, I get an AM
>>> >     error:
>>> >
>>> >     AM fails the job with the error:
>>> >
>>> >
>>> >                SecretManager$InvalidToken:
>>> >     appattempt_1424003606313_0001_000002 not found in
>>> >     AMRMTokenSecretManager
>>> >
>>> >
>>> >     I searched /var/log/secure on the client and cluster with no
>>> >     unusual messages.
>>> >
>>> >     Here are the contents of hadoop-cluster.xml:
>>> >
>>> >     <?xml version="1.0" encoding="UTF-8"?>
>>> >
>>> >     <!--generated by Roland-->
>>> >     <configuration>
>>> >       <property>
>>> >         <name>fs.defaultFS</name>
>>> >         <value>hdfs://mycluser:8020</value>
>>> >       </property>
>>> >       <property>
>>> >         <name>mapreduce.jobtracker.address</name>
>>> >         <value>hdfs://mycluster:8032</value>
>>> >       </property>
>>> >       <property>
>>> >         <name>yarn.resourcemanager.address</name>
>>> >         <value>hdfs://mycluster:8032</value>
>>> >       </property>
>>> >
>>> >     Here is the output from the job log on the cluster:
>>> >
>>> >     2015-02-15 07:51:06,544 INFO [main]
>>> >     org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created
>>> >     MRAppMaster for application appattempt_1424003606313_0001_000002
>>> >
>>> >     2015-02-15 07:51:06,949 WARN [main]
>>> >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>>> >     override final parameter: hadoop.ssl.require.client.cert;  Ignoring.
>>> >
>>> >     2015-02-15 07:51:06,952 WARN [main]
>>> >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>>> >     override final parameter:
>>> >     mapreduce.job.end-notification.max.retry.interval; Ignoring.
>>> >
>>> >     2015-02-15 07:51:06,952 WARN [main]
>>> >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>>> >     override final parameter: hadoop.ssl.client.conf;  Ignoring.
>>> >
>>> >     2015-02-15 07:51:06,954 WARN [main]
>>> >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>>> >     override final parameter: hadoop.ssl.keystores.factory.class; 
>>> >     Ignoring.
>>> >
>>> >     2015-02-15 07:51:06,957 WARN [main]
>>> >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>>> >     override final parameter: hadoop.ssl.server.conf;  Ignoring.
>>> >
>>> >     2015-02-15 07:51:06,973 WARN [main]
>>> >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>>> >     override final parameter:
>>> >     mapreduce.job.end-notification.max.attempts; Ignoring.
>>> >
>>> >     2015-02-15 07:51:07,241 INFO [main]
>>> >     org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
>>> >
>>> >     2015-02-15 07:51:07,241 INFO [main]
>>> >     org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind:
>>> >     YARN_AM_RM_TOKEN, Service: , Ident:
>>> >     (org.apache.hadoop.yarn.security.AMRMTokenIdentifier@33be1aa0)
>>> >
>>> >     2015-02-15 07:51:07,332 INFO [main]
>>> >     org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Using mapred
>>> >     newApiCommitter.
>>> >
>>> >     2015-02-15 07:51:07,627 WARN [main]
>>> >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>>> >     override final parameter: hadoop.ssl.require.client.cert;  Ignoring.
>>> >
>>> >     2015-02-15 07:51:07,632 WARN [main]
>>> >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>>> >     override final parameter:
>>> >     mapreduce.job.end-notification.max.retry.interval; Ignoring.
>>> >
>>> >     2015-02-15 07:51:07,632 WARN [main]
>>> >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>>> >     override final parameter: hadoop.ssl.client.conf;  Ignoring.
>>> >
>>> >     2015-02-15 07:51:07,639 WARN [main]
>>> >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>>> >     override final parameter: hadoop.ssl.keystores.factory.class; 
>>> >     Ignoring.
>>> >
>>> >     2015-02-15 07:51:07,645 WARN [main]
>>> >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>>> >     override final parameter: hadoop.ssl.server.conf;  Ignoring.
>>> >
>>> >     2015-02-15 07:51:07,663 WARN [main]
>>> >     org.apache.hadoop.conf.Configuration: job.xml:an attempt to
>>> >     override final parameter:
>>> >     mapreduce.job.end-notification.max.attempts; Ignoring.
>>> >
>>> >     2015-02-15 07:51:08,237 WARN [main]
>>> >     org.apache.hadoop.util.NativeCodeLoader: Unable to load
>>> >     native-hadoop library for your platform... using builtin-java
>>> >     classes where applicable
>>> >
>>> >     2015-02-15 07:51:08,429 INFO [main]
>>> >     org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter
>>> >     set in config null
>>> >
>>> >     2015-02-15 07:51:08,499 INFO [main]
>>> >     org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is
>>> >     org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
>>> >
>>> >     2015-02-15 07:51:08,526 INFO [main]
>>> >     org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class
>>> >     org.apache.hadoop.mapreduce.jobhistory.EventType for class
>>> >     org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler
>>> >
>>> >     2015-02-15 07:51:08,527 INFO [main]
>>> >     org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class
>>> >     org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for
>>> >     class
>>> >     org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher
>>> >
>>> >     2015-02-15 07:51:08,561 INFO [main]
>>> >     org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class
>>> >     org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for
>>> >     class
>>> >     org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher
>>> >
>>> >     2015-02-15 07:51:08,562 INFO [main]
>>> >     org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class
>>> >     org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType
>>> >     for class
>>> >     org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher
>>> >
>>> >     2015-02-15 07:51:08,566 INFO [main]
>>> >     org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class
>>> >     org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for
>>> >     class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
>>> >
>>> >     2015-02-15 07:51:08,568 INFO [main]
>>> >     org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class
>>> >     org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType
>>> >     for class
>>> >     org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher
>>> >
>>> >     2015-02-15 07:51:08,568 INFO [main]
>>> >     org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class
>>> >     org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType
>>> >     for class
>>> >     org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
>>> >
>>> >     2015-02-15 07:51:08,570 INFO [main]
>>> >     org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class
>>> >     org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType
>>> >     for class
>>> >     org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter
>>> >
>>> >     2015-02-15 07:51:08,599 INFO [main]
>>> >     org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Recovery is
>>> >     enabled. Will try to recover from previous life on best effort basis.
>>> >
>>> >     2015-02-15 07:51:08,642 INFO [main]
>>> >     org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Previous history
>>> >     file is at
>>> >     hdfs://mycluster.com:8020/user/cloudera/.staging/job_1424003606313_0001/job_1424003606313_0001_1.jhist
>>> >
>>> >     2015-02-15 07:51:09,147 INFO [main]
>>> >     org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Read
>>> >     completed tasks from history 0
>>> >
>>> >     2015-02-15 07:51:09,193 INFO [main]
>>> >     org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class
>>> >     org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type
>>> >     for class
>>> >     org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler
>>> >
>>> >     2015-02-15 07:51:09,222 INFO [main]
>>> >     org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties
>>> >     from hadoop-metrics2.properties
>>> >
>>> >     2015-02-15 07:51:09,277 INFO [main]

