hadoop-mapreduce-user mailing list archives

From Ulul <had...@ulul.org>
Subject Re: Yarn AM is abending job when submitting a remote job to cluster
Date Sun, 22 Feb 2015 18:45:29 GMT
OK, your pointing me to the Definitive Guide code made me realize that I 
couldn't use -conf because my driver didn't implement Tool.run. Now I 
can use that option; sorry for being misleading.

Nevertheless I now run into the jar distribution problem (No job jar 
file set). I'll try mapred job instead of hadoop jar
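
For anyone else hitting this: -conf (and the other generic options) are 
only parsed when the driver goes through ToolRunner/GenericOptionsParser. 
A minimal driver along those lines, using the class name from this thread 
(mapper/reducer setup elided; it needs the hadoop-client libraries on the 
classpath, so treat it as an outline rather than a drop-in):

```java
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class AvgWordLength extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // getConf() already has the -conf / -D generic options folded in
        Job job = Job.getInstance(getConf(), "avg word length");
        job.setJarByClass(AvgWordLength.class); // also avoids "No job jar file set"
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // set mapper/reducer/output types here ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner strips the generic options (-conf, -D, -fs, -jt)
        // before handing the remaining args to run()
        System.exit(ToolRunner.run(new AvgWordLength(), args));
    }
}
```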

On your side, could you try one conf dir with just two overrides? (This is 
how I had it working.)

So please create a conf dir with two files:
conf/yarn-site.xml containing
<configuration>
     <property>
         <name>yarn.resourcemanager.address</name>
         <value>hadoop0.rdpratti.com:8032</value>
     </property>
</configuration>

and
conf/core-site.xml containing
<configuration>
     <property>
         <name>fs.defaultFS</name>
         <value>hdfs://hadoop0.rdpratti.com:8020</value>
     </property>
</configuration>

and launch with
hadoop --config ./conf/ jar <jar name> <class name> <options>
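
If you'd rather generate those two files from an existing combined 
hadoop-cluster.xml than type them by hand, something like this rough 
sketch could do it (plain JDK XML parsing, not a Hadoop API; routing 
properties by name prefix is just my assumption of what belongs in 
which *-site.xml):

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ConfSplitter {

    // Route a property to a *-site.xml file by its name prefix.
    // These prefix rules are an assumption for illustration, not Hadoop logic.
    static String fileFor(String propName) {
        if (propName.startsWith("yarn.")) return "yarn-site.xml";
        if (propName.startsWith("mapreduce.")) return "mapred-site.xml";
        return "core-site.xml"; // fs.defaultFS and friends
    }

    // Parse a combined client config and write one *-site.xml per target.
    public static void split(String combinedXml, Path outDir) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(combinedXml)));
        Map<String, StringBuilder> out = new LinkedHashMap<>();
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
            String value = p.getElementsByTagName("value").item(0).getTextContent().trim();
            out.computeIfAbsent(fileFor(name), k -> new StringBuilder("<configuration>\n"))
               .append("  <property>\n")
               .append("    <name>").append(name).append("</name>\n")
               .append("    <value>").append(value).append("</value>\n")
               .append("  </property>\n");
        }
        Files.createDirectories(outDir);
        for (Map.Entry<String, StringBuilder> e : out.entrySet()) {
            Files.write(outDir.resolve(e.getKey()),
                    e.getValue().append("</configuration>\n").toString()
                            .getBytes(StandardCharsets.UTF_8));
        }
    }
}
```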

And then pastebin the job output

Ulul
Le 22/02/2015 15:31, Roland DePratti a écrit :
>
> Ulul,
>
> I will try your recommendation and see what happens.
>
> My prototype came from examples in Tom White’s Hadoop Definitive Guide 
> book, Chapter 5. And both logs were extracted from the cluster, but I 
> understand that even though they started there, that does not mean they 
> couldn’t be routed elsewhere; that IP address also made me wonder.
>
> Without the -conf option, the job runs successfully on my client machine.
>
> Let me ask you about the directory you talked about.
>
> -Should I include in that local directory the files from the server 
> Yarn config zip file or just my overrides?
>
> -I am still learning how they work together.
>
> Thanks
>
> -rd
>
> *From:*Ulul [mailto:hadoop@ulul.org]
> *Sent:* Sunday, February 22, 2015 8:33 AM
> *To:* user@hadoop.apache.org
> *Subject:* Re: Yarn AM is abending job when submitting a remote job to 
> cluster
>
> Roland,
>
> One thing I don't get is how your command works: hadoop jar has no 
> support for generic options such as -conf, but hadoop provides --config:
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/CommandsManual.html#Overview
>
> So my understanding is that your remote connection from the VM 
> actually stays local. This impression is reinforced by the following 
> log lines, from which you can see that the ipc client was connected to 
> IP ...185 while the successful attempt was connected to ...253:
>
> failed attempt log:
> 2015-02-21 17:54:59,107 DEBUG [main] org.apache.hadoop.ipc.Client: 
> closing ipc connection to quickstart.cloudera/192.168.2.185:8030: 
> appattempt_1424550134651_0001_000001 not found in AMRMTokenSecretManager.
>
> successful attempt log:
> 2015-02-21 19:01:19,371 DEBUG [IPC Client (1541092382) connection to 
> hadoop0.rdpratti.com/192.168.2.253:8030 from cloudera] 
> org.apache.hadoop.ipc.Client: IPC Client (1541092382) connection to 
> hadoop0.rdpratti.com/192.168.2.253:8030 from cloudera got value #10
>
> I think you ran 2 local jobs, one on the VM and one on the cluster, and 
> that there is something wrong in your VM configuration.
>
> You should try to run a local job on the VM and see what happens, and 
> try a remote job from the VM to the cluster with --config (it points to 
> a directory into which you need to split your cluster-conf.xml into 
> yarn-site.xml, core-site.xml...)
>
> Ulul
>
> Le 22/02/2015 13:42, Roland DePratti a écrit :
>
>     Ulul,
>
>     I appreciate your help and trying my use case.  I think I have a
>     lot of good details for you.
>
>     Here is my command:
>
>     hadoop jar avgwordlength.jar solution.AvgWordLength -conf
>     ~/conf/hadoop-cluster.xml /user/cloudera/shakespeare wordlengths7
>
>     Since my last email, I examined the syslogs (I ran both jobs with
>     debug turned on) for both the remote abend and the local
>     successful run on the cluster server.
>
>     I have attached both logs, plus a file where I posted my manual
>     comparison findings, and the config XML file.
>
>     Briefly, here is what I found (more details in Comparison Log w/
>     Notes file):
>
>     1. Both logs follow the same steps with the same outcome from the
>     beginning to line 1590.
>
>     2. At line 1590, both logs record an AMRMTokenSelector "Looking for
>     Token with service" message.
>
>     -The successful job does this on the cluster server
>     (192.168.2.253) since it was run locally.
>
>     -The abending job does this on the client vm (192.168.2.185)
>
>     3. After that point the logs are not the same until JobHistory kicks in.
>
>     -The abending log spends a lot of time trying to handle the error
>
>     -The successful job begins processing the job.
>
>     - At line 1615 it set up the queue (root.cloudera).
>
>     - At line 1651 JOB_SETUP_Complete is reported.
>
>     - Neither of these messages appears in the abended log.
>
>     My guess is this is a setup problem that I produced; I just can’t
>     find it.
>
>     - rd
>
>     *From:*Ulul [mailto:hadoop@ulul.org]
>     *Sent:* Saturday, February 21, 2015 9:50 PM
>     *To:* user@hadoop.apache.org <mailto:user@hadoop.apache.org>
>     *Subject:* Re: Yarn AM is abending job when submitting a remote
>     job to cluster
>
>     Hi Roland
>
>     I tried to reproduce your problem with a single node setup
>     submitting a job to a remote cluster (please note I'm an HDP user,
>     it's a sandbox submitting to a 3 VMs cluster)
>     It worked like a charm...
>     I ran into problems when submitting the job as another user, but
>     those were permission errors; it does not look like your AMRMToken
>     problem.
>
>     We are probably submitting our jobs differently though. I use
>     hadoop --config <conf dir> jar; you seem to be using something
>     different since you have the -conf generic option.
>
>     Would you please share your job command ?
>
>     Ulul
>
>     Le 20/02/2015 03:09, Roland DePratti a écrit :
>
>         Xuan,
>
>         Thanks for asking. Here is the RM log. It almost looks like
>         the log completes successfully (see red highlighting).
>
>         2015-02-19 19:55:43,315 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.ClientRMService:
>         Allocated new applicationId: 12
>         2015-02-19 19:55:44,659 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.ClientRMService:
>         Application with id 12 submitted by user cloudera
>         2015-02-19 19:55:44,659 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
>         Storing application with id application_1424003606313_0012
>         2015-02-19 19:55:44,659 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger:
>         USER=cloudera    IP=192.168.2.185    OPERATION=Submit
>         Application Request    TARGET=ClientRMService
>         RESULT=SUCCESS    APPID=application_1424003606313_0012
>         2015-02-19 19:55:44,659 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
>         application_1424003606313_0012 State change from NEW to NEW_SAVING
>         2015-02-19 19:55:44,659 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore:
>         Storing info for app: application_1424003606313_0012
>         2015-02-19 19:55:44,660 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
>         application_1424003606313_0012 State change from NEW_SAVING to
>         SUBMITTED
>         2015-02-19 19:55:44,666 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
>         Accepted application application_1424003606313_0012 from user:
>         cloudera, in queue: default, currently num of applications: 1
>         2015-02-19 19:55:44,667 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
>         application_1424003606313_0012 State change from SUBMITTED to
>         ACCEPTED
>         2015-02-19 19:55:44,667 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
>         Registering app attempt : appattempt_1424003606313_0012_000001
>         2015-02-19 19:55:44,667 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>         appattempt_1424003606313_0012_000001 State change from NEW to
>         SUBMITTED
>         2015-02-19 19:55:44,667 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
>         Added Application Attempt appattempt_1424003606313_0012_000001
>         to scheduler from user: cloudera
>         2015-02-19 19:55:44,669 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>         appattempt_1424003606313_0012_000001 State change from
>         SUBMITTED to SCHEDULED
>         2015-02-19 19:55:50,671 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
>         container_1424003606313_0012_01_000001 Container Transitioned
>         from NEW to ALLOCATED
>         2015-02-19 19:55:50,671 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger:
>         USER=cloudera    OPERATION=AM Allocated Container
>         TARGET=SchedulerApp    RESULT=SUCCESS
>         APPID=application_1424003606313_0012
>         CONTAINERID=container_1424003606313_0012_01_000001
>         2015-02-19 19:55:50,671 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode:
>         Assigned container container_1424003606313_0012_01_000001 of
>         capacity <memory:1024, vCores:1> on host
>         hadoop0.rdpratti.com:8041, which has 1 containers,
>         <memory:1024, vCores:1> used and <memory:433, vCores:1>
>         available after allocation
>         2015-02-19 19:55:50,672 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM:
>         Sending NMToken for nodeId : hadoop0.rdpratti.com:8041 for
>         container : container_1424003606313_0012_01_000001
>         2015-02-19 19:55:50,672 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
>         container_1424003606313_0012_01_000001 Container Transitioned
>         from ALLOCATED to ACQUIRED
>         2015-02-19 19:55:50,673 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM:
>         Clear node set for appattempt_1424003606313_0012_000001
>         2015-02-19 19:55:50,673 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>         Storing attempt: AppId: application_1424003606313_0012
>         AttemptId: appattempt_1424003606313_0012_000001
>         MasterContainer: Container: [ContainerId:
>         container_1424003606313_0012_01_000001, NodeId:
>         hadoop0.rdpratti.com:8041, NodeHttpAddress:
>         hadoop0.rdpratti.com:8042, Resource: <memory:1024, vCores:1>,
>         Priority: 0, Token: Token { kind: ContainerToken, service:
>         192.168.2.253:8041 }, ]
>         2015-02-19 19:55:50,673 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>         appattempt_1424003606313_0012_000001 State change from
>         SCHEDULED to ALLOCATED_SAVING
>         2015-02-19 19:55:50,673 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>         appattempt_1424003606313_0012_000001 State change from
>         ALLOCATED_SAVING to ALLOCATED
>         2015-02-19 19:55:50,673 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher:
>         Launching masterappattempt_1424003606313_0012_000001
>         2015-02-19 19:55:50,674 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher:
>         Setting up container Container: [ContainerId:
>         container_1424003606313_0012_01_000001, NodeId:
>         hadoop0.rdpratti.com:8041, NodeHttpAddress:
>         hadoop0.rdpratti.com:8042, Resource: <memory:1024, vCores:1>,
>         Priority: 0, Token: Token { kind: ContainerToken, service:
>         192.168.2.253:8041 }, ] for AM
>         appattempt_1424003606313_0012_000001
>         2015-02-19 19:55:50,675 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher:
>         Command to launch container
>         container_1424003606313_0012_01_000001 : $JAVA_HOME/bin/java
>         -Dlog4j.configuration=container-log4j.properties
>         -Dyarn.app.container.log.dir=<LOG_DIR>
>         -Dyarn.app.container.log.filesize=0
>         -Dhadoop.root.logger=INFO,CLA -Djava.net.preferIPv4Stack=true
>         -Xmx209715200 org.apache.hadoop.mapreduce.v2.app.MRAppMaster
>         1><LOG_DIR>/stdout 2><LOG_DIR>/stderr
>         2015-02-19 19:55:50,675 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager:
>         Create AMRMToken for ApplicationAttempt:
>         appattempt_1424003606313_0012_000001
>         2015-02-19 19:55:50,675 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager:
>         Creating password for appattempt_1424003606313_0012_000001
>         2015-02-19 19:55:50,688 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher:
>         Done launching container Container: [ContainerId:
>         container_1424003606313_0012_01_000001, NodeId:
>         hadoop0.rdpratti.com:8041, NodeHttpAddress:
>         hadoop0.rdpratti.com:8042, Resource: <memory:1024, vCores:1>,
>         Priority: 0, Token: Token { kind: ContainerToken, service:
>         192.168.2.253:8041 }, ] for AM
>         appattempt_1424003606313_0012_000001
>         2015-02-19 19:55:50,688 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>         appattempt_1424003606313_0012_000001 State change from
>         ALLOCATED to LAUNCHED
>         2015-02-19 19:55:50,928 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
>         container_1424003606313_0012_01_000001 Container Transitioned
>         from ACQUIRED to RUNNING
>         2015-02-19 19:55:57,941 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
>         container_1424003606313_0012_01_000001 Container Transitioned
>         from RUNNING to COMPLETED
>         2015-02-19 19:55:57,941 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt:
>         Completed container: container_1424003606313_0012_01_000001 in
>         state: COMPLETED event:FINISHED
>         2015-02-19 19:55:57,942 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger:
>         USER=cloudera    OPERATION=AM Released Container
>         TARGET=SchedulerApp    RESULT=SUCCESS
>         APPID=application_1424003606313_0012
>         CONTAINERID=container_1424003606313_0012_01_000001
>         2015-02-19 19:55:57,942 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode:
>         Released container container_1424003606313_0012_01_000001 of
>         capacity <memory:1024, vCores:1> on host
>         hadoop0.rdpratti.com:8041, which currently has 0 containers,
>         <memory:0, vCores:0> used and <memory:1457, vCores:2>
>         available, release resources=true
>         2015-02-19 19:55:57,942 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
>         Application attempt appattempt_1424003606313_0012_000001
>         released container container_1424003606313_0012_01_000001 on
>         node: host: hadoop0.rdpratti.com:8041 #containers=0
>         available=1457 used=0 with event: FINISHED
>         2015-02-19 19:55:57,942 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>         Updating application attempt
>         appattempt_1424003606313_0012_000001 with final state: FAILED,
>         and exit status: 1
>         2015-02-19 19:55:57,942 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>         appattempt_1424003606313_0012_000001 State change from
>         LAUNCHED to FINAL_SAVING
>         2015-02-19 19:55:57,942 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
>         Unregistering app attempt : appattempt_1424003606313_0012_000001
>         2015-02-19 19:55:57,943 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager:
>         Application finished, removing password for
>         appattempt_1424003606313_0012_000001
>         2015-02-19 19:55:57,943 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>         appattempt_1424003606313_0012_000001 State change from
>         FINAL_SAVING to FAILED
>         2015-02-19 19:55:57,943 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
>         Application appattempt_1424003606313_0012_000001 is done.
>         finalState=FAILED
>         2015-02-19 19:55:57,943 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
>         Registering app attempt : appattempt_1424003606313_0012_000002
>         2015-02-19 19:55:57,943 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo:
>         Application application_1424003606313_0012 requests cleared
>         2015-02-19 19:55:57,943 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>         appattempt_1424003606313_0012_000002 State change from NEW to
>         SUBMITTED
>         2015-02-19 19:55:57,943 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
>         Added Application Attempt appattempt_1424003606313_0012_000002
>         to scheduler from user: cloudera
>         2015-02-19 19:55:57,943 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>         appattempt_1424003606313_0012_000002 State change from
>         SUBMITTED to SCHEDULED
>         2015-02-19 19:55:58,941 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
>         Null container completed...
>         2015-02-19 19:56:03,950 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
>         container_1424003606313_0012_02_000001 Container Transitioned
>         from NEW to ALLOCATED
>         2015-02-19 19:56:03,950 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger:
>         USER=cloudera    OPERATION=AM Allocated Container
>         TARGET=SchedulerApp    RESULT=SUCCESS
>         APPID=application_1424003606313_0012
>         CONTAINERID=container_1424003606313_0012_02_000001
>         2015-02-19 19:56:03,950 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode:
>         Assigned container container_1424003606313_0012_02_000001 of
>         capacity <memory:1024, vCores:1> on host
>         hadoop0.rdpratti.com:8041, which has 1 containers,
>         <memory:1024, vCores:1> used and <memory:433, vCores:1>
>         available after allocation
>         2015-02-19 19:56:03,950 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM:
>         Sending NMToken for nodeId : hadoop0.rdpratti.com:8041 for
>         container : container_1424003606313_0012_02_000001
>         2015-02-19 19:56:03,951 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
>         container_1424003606313_0012_02_000001 Container Transitioned
>         from ALLOCATED to ACQUIRED
>         2015-02-19 19:56:03,951 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM:
>         Clear node set for appattempt_1424003606313_0012_000002
>         2015-02-19 19:56:03,951 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>         Storing attempt: AppId: application_1424003606313_0012
>         AttemptId: appattempt_1424003606313_0012_000002
>         MasterContainer: Container: [ContainerId:
>         container_1424003606313_0012_02_000001, NodeId:
>         hadoop0.rdpratti.com:8041, NodeHttpAddress:
>         hadoop0.rdpratti.com:8042, Resource: <memory:1024, vCores:1>,
>         Priority: 0, Token: Token { kind: ContainerToken, service:
>         192.168.2.253:8041 }, ]
>         2015-02-19 19:56:03,952 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>         appattempt_1424003606313_0012_000002 State change from
>         SCHEDULED to ALLOCATED_SAVING
>         2015-02-19 19:56:03,952 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>         appattempt_1424003606313_0012_000002 State change from
>         ALLOCATED_SAVING to ALLOCATED
>         2015-02-19 19:56:03,952 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher:
>         Launching masterappattempt_1424003606313_0012_000002
>         2015-02-19 19:56:03,953 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher:
>         Setting up container Container: [ContainerId:
>         container_1424003606313_0012_02_000001, NodeId:
>         hadoop0.rdpratti.com:8041, NodeHttpAddress:
>         hadoop0.rdpratti.com:8042, Resource: <memory:1024, vCores:1>,
>         Priority: 0, Token: Token { kind: ContainerToken, service:
>         192.168.2.253:8041 }, ] for AM
>         appattempt_1424003606313_0012_000002
>         2015-02-19 19:56:03,953 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher:
>         Command to launch container
>         container_1424003606313_0012_02_000001 : $JAVA_HOME/bin/java
>         -Dlog4j.configuration=container-log4j.properties
>         -Dyarn.app.container.log.dir=<LOG_DIR>
>         -Dyarn.app.container.log.filesize=0
>         -Dhadoop.root.logger=INFO,CLA -Djava.net.preferIPv4Stack=true
>         -Xmx209715200 org.apache.hadoop.mapreduce.v2.app.MRAppMaster
>         1><LOG_DIR>/stdout 2><LOG_DIR>/stderr
>         2015-02-19 19:56:03,953 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager:
>         Create AMRMToken for ApplicationAttempt:
>         appattempt_1424003606313_0012_000002
>         2015-02-19 19:56:03,953 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager:
>         Creating password for appattempt_1424003606313_0012_000002
>         2015-02-19 19:56:03,974 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher:
>         Done launching container Container: [ContainerId:
>         container_1424003606313_0012_02_000001, NodeId:
>         hadoop0.rdpratti.com:8041, NodeHttpAddress:
>         hadoop0.rdpratti.com:8042, Resource: <memory:1024, vCores:1>,
>         Priority: 0, Token: Token { kind: ContainerToken, service:
>         192.168.2.253:8041 }, ] for AM
>         appattempt_1424003606313_0012_000002
>         2015-02-19 19:56:03,974 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>         appattempt_1424003606313_0012_000002 State change from
>         ALLOCATED to LAUNCHED
>         2015-02-19 19:56:04,947 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
>         container_1424003606313_0012_02_000001 Container Transitioned
>         from ACQUIRED to RUNNING
>         2015-02-19 19:56:10,956 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
>         container_1424003606313_0012_02_000001 Container Transitioned
>         from RUNNING to COMPLETED
>         2015-02-19 19:56:10,956 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt:
>         Completed container: container_1424003606313_0012_02_000001 in
>         state: COMPLETED event:FINISHED
>         2015-02-19 19:56:10,956 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger:
>         USER=cloudera    OPERATION=AM Released Container
>         TARGET=SchedulerApp    RESULT=SUCCESS
>         APPID=application_1424003606313_0012
>         CONTAINERID=container_1424003606313_0012_02_000001
>         2015-02-19 19:56:10,956 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode:
>         Released container container_1424003606313_0012_02_000001 of
>         capacity <memory:1024, vCores:1> on host
>         hadoop0.rdpratti.com:8041, which currently has 0 containers,
>         <memory:0, vCores:0> used and <memory:1457, vCores:2>
>         available, release resources=true
>         2015-02-19 19:56:10,956 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>         Updating application attempt
>         appattempt_1424003606313_0012_000002 with final state: FAILED,
>         and exit status: 1
>         2015-02-19 19:56:10,956 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
>         Application attempt appattempt_1424003606313_0012_000002
>         released container container_1424003606313_0012_02_000001 on
>         node: host: hadoop0.rdpratti.com:8041 #containers=0
>         available=1457 used=0 with event: FINISHED
>         2015-02-19 19:56:10,956 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>         appattempt_1424003606313_0012_000002 State change from
>         LAUNCHED to FINAL_SAVING
>         2015-02-19 19:56:10,956 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
>         Unregistering app attempt : appattempt_1424003606313_0012_000002
>         2015-02-19 19:56:10,957 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager:
>         Application finished, removing password for
>         appattempt_1424003606313_0012_000002
>         2015-02-19 19:56:10,957 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>         appattempt_1424003606313_0012_000002 State change from
>         FINAL_SAVING to FAILED
>         2015-02-19 19:56:10,957 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
>         Updating application application_1424003606313_0012 with final
>         state: FAILED
>         2015-02-19 19:56:10,957 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
>         application_1424003606313_0012 State change from ACCEPTED to
>         FINAL_SAVING
>         2015-02-19 19:56:10,957 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore:
>         Updating info for app: application_1424003606313_0012
>         2015-02-19 19:56:10,957 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
>         Application appattempt_1424003606313_0012_000002 is done.
>         finalState=FAILED
>         2015-02-19 19:56:10,957 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo:
>         Application application_1424003606313_0012 requests cleared
>         2015-02-19 19:56:10,990 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
>         Application application_1424003606313_0012 failed 2 times due
>         to AM Container for appattempt_1424003606313_0012_000002
>         exited with exitCode: 1 due to: Exception from container-launch.
>         Container id: container_1424003606313_0012_02_000001
>         Exit code: 1
>         Stack trace: ExitCodeException exitCode=1:
>             at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
>             at org.apache.hadoop.util.Shell.run(Shell.java:455)
>             at
>         org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
>             at
>         org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:197)
>             at
>         org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
>             at
>         org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
>             at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>             at
>         java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>             at
>         java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>             at java.lang.Thread.run(Thread.java:745)
>
>
>         Container exited with a non-zero exit code 1
>         .Failing this attempt.. Failing the application.
>         2015-02-19 19:56:10,990 INFO
>         org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
>         application_1424003606313_0012 State change from FINAL_SAVING
>         to FAILED
>         2015-02-19 19:56:10,991 WARN
>         org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger:
>         USER=cloudera    OPERATION=Application Finished - Failed   
>         TARGET=RMAppManager    RESULT=FAILURE DESCRIPTION=App failed
>         with state: FAILED PERMISSIONS=Application
>         application_1424003606313_0012 failed 2 times due to AM
>         Container for appattempt_1424003606313_0012_000002 exited with
>         exitCode: 1 due to: Exception from container-launch.
>         Container id: container_1424003606313_0012_02_000001
>         Exit code: 1
>         Stack trace: ExitCodeException exitCode=1:
>             at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
>             at org.apache.hadoop.util.Shell.run(Shell.java:455)
>             at
>         org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
>             at
>         org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:197)
>             at
>         org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
>             at
>         org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
>             at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>             at
>         java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>             at
>         java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>             at java.lang.Thread.run(Thread.java:745)
>
>
>
>         *From:*Xuan Gong [mailto:xgong@hortonworks.com]
>         *Sent:* Thursday, February 19, 2015 8:23 PM
>         *To:* user@hadoop.apache.org <mailto:user@hadoop.apache.org>
>         *Subject:* Re: Yarn AM is abending job when submitting a
>         remote job to cluster
>
>         Hey, Roland:
>
>           Could you also check the RM logs for this application,
>         please ? Maybe we could find something there.
>
>         Thanks
>
>         Xuan Gong
>
>         *From: *Roland DePratti <roland.depratti@cox.net>
>         *Reply-To: *"user@hadoop.apache.org" <user@hadoop.apache.org>
>         *Date: *Thursday, February 19, 2015 at 5:11 PM
>         *To: *"user@hadoop.apache.org" <user@hadoop.apache.org>
>         *Subject: *RE: Yarn AM is abending job when submitting a
>         remote job to cluster
>
>         No, I hear you.
>
>         I was just stating that since hdfs works, something is right
>         about the connectivity, that's all: the server is reachable and
>         hadoop was able to process the request. But, as you said, that
>         doesn't mean yarn works.
>
>         I tried both your solution and Alex's solution, unfortunately
>         without any improvement.
>
>         Here is the command I am executing:
>
>         hadoop jar avgWordlength.jar  solution.AvgWordLength -conf
>         ~/conf/hadoop-cluster.xml /user/cloudera/shakespeare wordlength4
>
>         Here is the new hadoop-cluster.xml:
>
>         <?xml version="1.0" encoding="UTF-8"?>
>
>         <!--generated by Roland-->
>         <configuration>
>           <property>
>             <name>fs.defaultFS</name>
>         <value>hdfs://hadoop0.rdpratti.com:8020</value>
>           </property>
>           <property>
>         <name>mapreduce.jobtracker.address</name>
>         <value>hadoop0.rdpratti.com:8032</value>
>           </property>
>           <property>
>         <name>yarn.resourcemanager.address</name>
>         <value>hadoop0.rdpratti.com:8032</value>
>           </property>
>         </configuration>
>
>
>
>
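For reference, a client-side override file like the one above is just a flat list of `<property>` name/value pairs inside a `<configuration>` element. A minimal Python sketch (a hypothetical helper for illustration, not Hadoop's own `Configuration` class) that reads such a file back as a dict:

```python
# Hypothetical helper: parse a Hadoop-style configuration XML
# (a <configuration> holding <property><name>/<value> pairs)
# into a plain {name: value} dict.
import xml.etree.ElementTree as ET

def parse_hadoop_conf(xml_text):
    """Return {name: value} for each <property> in a <configuration>."""
    root = ET.fromstring(xml_text)
    conf = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            conf[name] = prop.findtext("value")
    return conf

# Sample mirroring the override file discussed in this thread.
SAMPLE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop0.rdpratti.com:8020</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>hadoop0.rdpratti.com:8032</value>
  </property>
</configuration>"""

conf = parse_hadoop_conf(SAMPLE)
print(conf["fs.defaultFS"])                  # hdfs://hadoop0.rdpratti.com:8020
print(conf["yarn.resourcemanager.address"])  # hadoop0.rdpratti.com:8032
```

Note that only fs.defaultFS carries a URI scheme; the resource manager address is a bare host:port.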
>         I also deleted the .staging directory under the submitting
>         user and restarted the Job History Server.
>
>         Resubmitted the job with the same result. Here is the log:
>
>         2015-02-19 19:56:05,061 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1424003606313_0012_000002
>
>         2015-02-19 19:56:05,468 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.require.client.cert;  Ignoring.
>
>         2015-02-19 19:56:05,471 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
>
>         2015-02-19 19:56:05,471 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.client.conf;  Ignoring.
>
>         2015-02-19 19:56:05,473 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.keystores.factory.class;  Ignoring.
>
>         2015-02-19 19:56:05,476 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.server.conf;  Ignoring.
>
>         2015-02-19 19:56:05,490 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
>
>         2015-02-19 19:56:05,621 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
>
>         2015-02-19 19:56:05,621 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (org.apache.hadoop.yarn.security.AMRMTokenIdentifier@3909f88f)
>
>         2015-02-19 19:56:05,684 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Using mapred newApiCommitter.
>
>         2015-02-19 19:56:05,923 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.require.client.cert;  Ignoring.
>
>         2015-02-19 19:56:05,925 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
>
>         2015-02-19 19:56:05,929 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.client.conf;  Ignoring.
>
>         2015-02-19 19:56:05,930 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.keystores.factory.class;  Ignoring.
>
>         2015-02-19 19:56:05,934 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.server.conf;  Ignoring.
>
>         2015-02-19 19:56:05,958 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
>
>         2015-02-19 19:56:06,529 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>
>         2015-02-19 19:56:06,719 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config null
>
>         2015-02-19 19:56:06,837 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
>
>         2015-02-19 19:56:06,881 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.jobhistory.EventType for class org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler
>
>         2015-02-19 19:56:06,882 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher
>
>         2015-02-19 19:56:06,882 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher
>
>         2015-02-19 19:56:06,883 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher
>
>         2015-02-19 19:56:06,884 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
>
>         2015-02-19 19:56:06,885 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher
>
>         2015-02-19 19:56:06,885 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
>
>         2015-02-19 19:56:06,886 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter
>
>         2015-02-19 19:56:06,899 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Recovery is enabled. Will try to recover from previous life on best effort basis.
>
>         2015-02-19 19:56:06,918 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Previous history file is at hdfs://hadoop0.rdpratti.com:8020/user/cloudera/.staging/job_1424003606313_0012/job_1424003606313_0012_1.jhist
>
>         2015-02-19 19:56:07,377 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Read completed tasks from history 0
>
>         2015-02-19 19:56:07,423 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler
>
>         2015-02-19 19:56:07,453 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>
>         2015-02-19 19:56:07,507 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>
>         2015-02-19 19:56:07,507 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MRAppMaster metrics system started
>
>         2015-02-19 19:56:07,515 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Adding job token for job_1424003606313_0012 to jobTokenSecretManager
>
>         2015-02-19 19:56:07,536 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Not uberizing job_1424003606313_0012 because: not enabled; too much RAM;
>
>         2015-02-19 19:56:07,555 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job job_1424003606313_0012 = 5343207. Number of splits = 5
>
>         2015-02-19 19:56:07,557 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Number of reduces for job job_1424003606313_0012 = 1
>
>         2015-02-19 19:56:07,557 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1424003606313_0012Job Transitioned from NEW to INITED
>
>         2015-02-19 19:56:07,558 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster launching normal, non-uberized, multi-container job job_1424003606313_0012.
>
>         2015-02-19 19:56:07,618 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
>
>         2015-02-19 19:56:07,630 INFO [Socket Reader #1 for port 46841] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 46841
>
>         2015-02-19 19:56:07,648 INFO [main] org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.mapreduce.v2.api.MRClientProtocolPB to the server
>
>         2015-02-19 19:56:07,648 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>
>         2015-02-19 19:56:07,649 INFO [main] org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Instantiated MRClientService at hadoop0.rdpratti.com/192.168.2.253:46841
>
>         2015-02-19 19:56:07,650 INFO [IPC Server listener on 46841] org.apache.hadoop.ipc.Server: IPC Server listener on 46841: starting
>
>         2015-02-19 19:56:07,721 INFO [main] org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>
>         2015-02-19 19:56:07,727 INFO [main] org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.mapreduce is not defined
>
>         2015-02-19 19:56:07,739 INFO [main] org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
>
>         2015-02-19 19:56:07,745 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context mapreduce
>
>         2015-02-19 19:56:07,745 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context static
>
>         2015-02-19 19:56:07,749 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /mapreduce/*
>
>         2015-02-19 19:56:07,749 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /ws/*
>
>         2015-02-19 19:56:07,760 INFO [main] org.apache.hadoop.http.HttpServer2: Jetty bound to port 39939
>
>         2015-02-19 19:56:07,760 INFO [main] org.mortbay.log: jetty-6.1.26.cloudera.4
>
>         2015-02-19 19:56:07,789 INFO [main] org.mortbay.log: Extract jar:file:/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/jars/hadoop-yarn-common-2.5.0-cdh5.3.0.jar!/webapps/mapreduce to /tmp/Jetty_0_0_0_0_39939_mapreduce____.o5qk0w/webapp
>
>         2015-02-19 19:56:08,156 INFO [main] org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:39939
>
>         2015-02-19 19:56:08,157 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Web app /mapreduce started at 39939
>
>         2015-02-19 19:56:08,629 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules
>
>         2015-02-19 19:56:08,634 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
>
>         2015-02-19 19:56:08,635 INFO [Socket Reader #1 for port 43858] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 43858
>
>         2015-02-19 19:56:08,639 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>
>         2015-02-19 19:56:08,642 INFO [IPC Server listener on 43858] org.apache.hadoop.ipc.Server: IPC Server listener on 43858: starting
>
>         2015-02-19 19:56:08,663 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: nodeBlacklistingEnabled:true
>
>         2015-02-19 19:56:08,663 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: maxTaskFailuresPerNode is 3
>
>         2015-02-19 19:56:08,663 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: blacklistDisablePercent is 33
>
>         2015-02-19 19:56:08,797 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.require.client.cert;  Ignoring.
>
>         2015-02-19 19:56:08,798 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
>
>         2015-02-19 19:56:08,798 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.client.conf;  Ignoring.
>
>         2015-02-19 19:56:08,798 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.keystores.factory.class;  Ignoring.
>
>         2015-02-19 19:56:08,799 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.server.conf;  Ignoring.
>
>         2015-02-19 19:56:08,809 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
>
>         2015-02-19 19:56:08,821 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at quickstart.cloudera/192.168.2.185:8030
>
>         2015-02-19 19:56:08,975 WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:cloudera (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): appattempt_1424003606313_0012_000002 not found in AMRMTokenSecretManager.
>
>         2015-02-19 19:56:08,976 WARN [main] org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): appattempt_1424003606313_0012_000002 not found in AMRMTokenSecretManager.
>
>         2015-02-19 19:56:08,976 WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:cloudera (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): appattempt_1424003606313_0012_000002 not found in AMRMTokenSecretManager.
>
>         2015-02-19 19:56:08,981 ERROR [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Exception while registering
>
>         org.apache.hadoop.security.token.SecretManager$InvalidToken: appattempt_1424003606313_0012_000002 not found in AMRMTokenSecretManager.
>
>                  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>
>                  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>
>                  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>
>                  at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>
>                  at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>
>                  at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
>
>                  at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:109)
>
>                  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
>                  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>
>                  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>
>                  at java.lang.reflect.Method.invoke(Method.java:606)
>
>                  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>
>                  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>
>                  at com.sun.proxy.$Proxy36.registerApplicationMaster(Unknown Source)
>
>                  at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:161)
>
>                  at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:122)
>
>                  at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:238)
>
>                  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>
>                  at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.serviceStart(MRAppMaster.java:807)
>
>                  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>
>                  at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
>
>                  at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1075)
>
>                  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>
>                  at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1478)
>
>                  at java.security.AccessController.doPrivileged(Native Method)
>
>                  at javax.security.auth.Subject.doAs(Subject.java:415)
>
>                  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
>
>                  at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1474)
>
>                  at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1407)
>
>         Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): appattempt_1424003606313_0012_000002 not found in AMRMTokenSecretManager.
>
>                  at org.apache.hadoop.ipc.Client.call(Client.java:1411)
>
>                  at org.apache.hadoop.ipc.Client.call(Client.java:1364)
>
>                  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>
>                  at com.sun.proxy.$Proxy35.registerApplicationMaster(Unknown Source)
>
>                  at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:106)
>
>                  ... 22 more
>
>         2015-02-19 19:56:08,983 INFO [main] org.apache.hadoop.service.AbstractService: Service RMCommunicator failed in state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.security.token.SecretManager$InvalidToken: appattempt_1424003606313_0012_000002 not found in AMRMTokenSecretManager.
>
>         org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.security.token.SecretManager$InvalidToken: appattempt_1424003606313_0012_000002 not found in AMRMTokenSecretManager.
>
>         *From:*Ulul [mailto:hadoop@ulul.org]
>         *Sent:* Thursday, February 19, 2015 5:08 PM
>         *To:* user@hadoop.apache.org
>         *Subject:* Re: Yarn AM is abending job when submitting a
>         remote job to cluster
>
>         Is your point that using the hdfs:// prefix is valid since
>         your hdfs client works?
>         fs.defaultFS defines the namenode address and the filesystem
>         type. It doesn't imply that the prefix should be used for yarn
>         and mapreduce options that are not directly linked to hdfs.
>
>
>
>
>         Le 19/02/2015 22:56, Ulul a écrit :
>
>             In that case it's just between your hdfs client, the NN
>             and the DNs, no YARN or MR component involved.
>             The fact that this works is not related to your MR job not
>             succeeding.
>
>
>
>
>             Le 19/02/2015 22:45, roland.depratti a écrit :
>
>                 Thanks for looking at my problem.
>
>                 I can run an hdfs command from the client, with the
>                 config file listed, that does a cat on a file in hdfs
>                 on the remote cluster and returns the contents of that
>                 file to the client.
>
>                 - rd
>
>                 Sent from my Verizon Wireless 4G LTE smartphone
>
>
>
>                 -------- Original message --------
>                 From: Ulul <hadoop@ulul.org>
>                 Date:02/19/2015 4:03 PM (GMT-05:00)
>                 To: user@hadoop.apache.org
>                 Subject: Re: Yarn AM is abending job when submitting a
>                 remote job to cluster
>
>                 Hi
>                 Doesn't seem like an ssl error to me (the log states
>                 that attempts to
>                 override final properties are ignored)
>
>                 On the other hand the configuration seems wrong:
>                 mapreduce.jobtracker.address and
>                 yarn.resourcemanager.address should
>                 only contain an IP or a hostname. You should remove
>                 'hdfs://', though the
>                 log doesn't suggest it has anything to do with your
>                 problem.
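Ulul's point can be illustrated with a quick shape check: these address properties expect a bare host:port, so a value that still carries an hdfs:// scheme fails the check. (A hypothetical sketch for illustration; this is not validation Hadoop itself performs at submit time.)

```python
import re

# Rough sanity check: YARN/MapReduce address properties such as
# yarn.resourcemanager.address take "host:port" with no URI scheme,
# unlike fs.defaultFS, which is a full hdfs:// URI.
def looks_like_host_port(value):
    return re.fullmatch(r"[A-Za-z0-9.\-]+:\d+", value) is not None

print(looks_like_host_port("hadoop0.rdpratti.com:8032"))  # True
print(looks_like_host_port("hdfs://mycluster:8032"))      # False: scheme prefix
```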
>
>                 And what do you mean by an "HDFS job" ?
>
>                 Ulul
>
>                 Le 19/02/2015 04:22, daemeon reiydelle a écrit :
>                 > I would guess you do not have your ssl certs set up,
>                 client or server,
>                 > based on the error.
>                 >
>                 > ***
>                 > .......
>                 > ***“Life should not be a journey to the grave with
>                 the intention of
>                 > arriving safely in a
>                 > pretty and well preserved body, but rather to skid
>                 in broadside in a
>                 > cloud of smoke,
>                 > thoroughly used up, totally worn out, and loudly
>                 proclaiming “Wow!
>                 > What a Ride!”*
>                 > - Hunter Thompson
>                 >
>                 > Daemeon C.M. Reiydelle
>                 > USA (+1) 415.501.0198
>                 > London (+44) (0) 20 8144 9872*/
>                 > /
>                 >
>                 > On Wed, Feb 18, 2015 at 5:19 PM, Roland DePratti
>                 > <roland.depratti@cox.net> wrote:
>                 >
>                 >     I have been searching for a handle on a problem
>                 >     with very few clues. Any help pointing me in the
>                 >     right direction will be huge.
>                 >
>                 >     I have not received any input from the Cloudera
>                 >     Google groups. Perhaps this is more Yarn based and
>                 >     I am hoping I have more luck here.
>                 >
>                 >     Any help is greatly appreciated.
>                 >
>                 >     I am running a Hadoop cluster using CDH5.3. I
>                 also have a client
>                 >     machine with a standalone one node setup (VM).
>                 >
>                 >     All environments are running CentOS 6.6.
>                 >
>                 >     I have submitted some Java mapreduce jobs
>                 >     locally on both the cluster and the standalone
>                 >     environment with successful completions.
>                 >
>                 >     I can submit a remote HDFS job from client to
>                 cluster using -conf
>                 >     hadoop-cluster.xml (see below) and get data back
>                 from the cluster
>                 >     with no problem.
>                 >
>                 >     When I submit the mapreduce jobs remotely,
>                 >     I get an AM error:
>                 >
>                 >     AM fails the job with the error:
>                 >
>                 >
>                 >                SecretManager$InvalidToken:
>                 >     appattempt_1424003606313_0001_000002 not found in
>                 >     AMRMTokenSecretManager
>                 >
>                 >
>                 >     I searched /var/log/secure on the client and
>                 cluster with no
>                 >     unusual messages.
>                 >
>                 >     Here is the contents of hadoop-cluster.xml:
>                 >
>                 >     <?xml version="1.0" encoding="UTF-8"?>
>                 >
>                 >     <!--generated by Roland-->
>                 >     <configuration>
>                 >       <property>
>                 >         <name>fs.defaultFS</name>
>                 > <value>hdfs://mycluser:8020</value>
>                 >       </property>
>                 >       <property>
>                 > <name>mapreduce.jobtracker.address</name>
>                 > <value>hdfs://mycluster:8032</value>
>                 >       </property>
>                 >       <property>
>                 > <name>yarn.resourcemanager.address</name>
>                 > <value>hdfs://mycluster:8032</value>
>                 >       </property>
>                 >     </configuration>
>                 >
>                 >     Here is the output from the job log on the cluster:
>                 >
>                 >     2015-02-15 07:51:06,544 INFO [main]
>                 > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created
>                 >     MRAppMaster for application
>                 appattempt_1424003606313_0001_000002
>                 >
>                 >     2015-02-15 07:51:06,949 WARN [main]
>                 >     org.apache.hadoop.conf.Configuration: job.xml:an
>                 attempt to
>                 >     override final parameter:
>                 hadoop.ssl.require.client.cert;  Ignoring.
>                 >
>                 >     2015-02-15 07:51:06,952 WARN [main]
>                 >     org.apache.hadoop.conf.Configuration: job.xml:an
>                 attempt to
>                 >     override final parameter:
>                 > mapreduce.job.end-notification.max.retry.interval;
>                 Ignoring.
>                 >
>                 >     2015-02-15 07:51:06,952 WARN [main]
>                 >     org.apache.hadoop.conf.Configuration: job.xml:an
>                 attempt to
>                 >     override final parameter:
>                 hadoop.ssl.client.conf;  Ignoring.
>                 >
>                 >     2015-02-15 07:51:06,954 WARN [main]
>                 >     org.apache.hadoop.conf.Configuration: job.xml:an
>                 attempt to
>                 >     override final parameter:
>                 hadoop.ssl.keystores.factory.class;
>                 >     Ignoring.
>                 >
>                 >     2015-02-15 07:51:06,957 WARN [main]
>                 >     org.apache.hadoop.conf.Configuration: job.xml:an
>                 attempt to
>                 >     override final parameter:
>                 hadoop.ssl.server.conf;  Ignoring.
>                 >
>                 >     2015-02-15 07:51:06,973 WARN [main]
>                 >     org.apache.hadoop.conf.Configuration: job.xml:an
>                 attempt to
>                 >     override final parameter:
>                 > mapreduce.job.end-notification.max.attempts; Ignoring.
>                 >
>                 >     2015-02-15 07:51:07,241 INFO [main]
>                 > org.apache.hadoop.mapreduce.v2.app.MRAppMaster:
>                 Executing with tokens:
>                 >
>                 >     2015-02-15 07:51:07,241 INFO [main]
>                 > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind:
>                 >     YARN_AM_RM_TOKEN, Service: , Ident:
>                 >    
>                 (org.apache.hadoop.yarn.security.AMRMTokenIdentifier@33be1aa0)
>                 >
>                 >     2015-02-15 07:51:07,332 INFO [main]
>                 > org.apache.hadoop.mapreduce.v2.app.MRAppMaster:
>                 Using mapred
>                 >     newApiCommitter.
>                 >
>                 >     2015-02-15 07:51:07,627 WARN [main]
>                 >     org.apache.hadoop.conf.Configuration: job.xml:an
>                 attempt to
>                 >     override final parameter:
>                 hadoop.ssl.require.client.cert;  Ignoring.
>                 >
>                 >     2015-02-15 07:51:07,632 WARN [main]
>                 >     org.apache.hadoop.conf.Configuration: job.xml:an
>     attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
>
>     2015-02-15 07:51:07,632 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.client.conf;  Ignoring.
>
>     2015-02-15 07:51:07,639 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.keystores.factory.class;  Ignoring.
>
>     2015-02-15 07:51:07,645 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.server.conf;  Ignoring.
>
>     2015-02-15 07:51:07,663 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
>
>     2015-02-15 07:51:08,237 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>
>     2015-02-15 07:51:08,429 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config null
>
>     2015-02-15 07:51:08,499 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
>
>     2015-02-15 07:51:08,526 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.jobhistory.EventType for class org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler
>
>     2015-02-15 07:51:08,527 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher
>
>     2015-02-15 07:51:08,561 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher
>
>     2015-02-15 07:51:08,562 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher
>
>     2015-02-15 07:51:08,566 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
>
>     2015-02-15 07:51:08,568 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher
>
>     2015-02-15 07:51:08,568 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
>
>     2015-02-15 07:51:08,570 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter
>
>     2015-02-15 07:51:08,599 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Recovery is enabled. Will try to recover from previous life on best effort basis.
>
>     2015-02-15 07:51:08,642 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Previous history file is at hdfs://mycluster.com:8020/user/cloudera/.staging/job_1424003606313_0001/job_1424003606313_0001_1.jhist
>
>     2015-02-15 07:51:09,147 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Read completed tasks from history 0
>
>     2015-02-15 07:51:09,193 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler
>
>     2015-02-15 07:51:09,222 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>
>     2015-02-15 07:51:09,277 INFO [main]
>
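A note on the repeated "attempt to override final parameter ... Ignoring" warnings above: they appear when the job's generated job.xml tries to set a property that a cluster-side *-site.xml has marked with `<final>true</final>`. Hadoop then keeps the cluster value and ignores the client's. As a sketch (the value shown for `hadoop.ssl.client.conf` is illustrative, not taken from this cluster's actual config), such a server-side entry looks like:

```xml
<!-- Cluster-side site file (e.g. core-site.xml on the nodes).
     Because of <final>true</final>, any job.xml override of this
     property triggers the WARN seen in the AM log and is ignored. -->
<property>
    <name>hadoop.ssl.client.conf</name>
    <value>ssl-client.xml</value>
    <final>true</final>
</property>
```

These warnings are generally harmless here; they just show the client configuration being reconciled against the cluster's final settings.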

