hadoop-hdfs-user mailing list archives

From Silvina Caíno Lores <silvi.ca...@gmail.com>
Subject Re: Job stuck in running state on Hadoop 2.2.0
Date Tue, 10 Dec 2013 09:49:16 GMT
Thank you! I realized that, although I exported the variables in the
scripts, there were a few errors and my desired configuration wasn't being
used (which explained some other strange behavior).
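
(Side note, in case it helps anyone else: to double-check which
configuration is actually in effect, I believe hdfs getconf can print the
value a key resolves to; the keys below are just examples:

# example keys only -- substitute whichever properties you expect to differ
hdfs getconf -confKey fs.defaultFS
hdfs getconf -confKey dfs.replication

If those don't match what's in $HADOOP_CONF_DIR, the wrong configuration
directory is being read.)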

However, I'm still getting the same issue with the examples, for instance:

hadoop jar
~/hadoop-2.2.0-maven/hadoop-dist/target/hadoop-3.0.0-SNAPSHOT/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-SNAPSHOT.jar
pi 1 100
Number of Maps = 1
Samples per Map = 100
13/12/10 10:41:18 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
Wrote input for Map #0
Starting Job
13/12/10 10:41:19 INFO client.RMProxy: Connecting to ResourceManager at /
0.0.0.0:8032
13/12/10 10:41:20 INFO input.FileInputFormat: Total input paths to process : 1
13/12/10 10:41:20 INFO mapreduce.JobSubmitter: number of splits:1
13/12/10 10:41:20 INFO Configuration.deprecation: user.name is deprecated.
Instead, use mapreduce.job.user.name
13/12/10 10:41:20 INFO Configuration.deprecation: mapred.jar is deprecated.
Instead, use mapreduce.job.jar
13/12/10 10:41:20 INFO Configuration.deprecation:
mapred.map.tasks.speculative.execution is deprecated. Instead, use
mapreduce.map.speculative
13/12/10 10:41:20 INFO Configuration.deprecation: mapred.reduce.tasks is
deprecated. Instead, use mapreduce.job.reduces
13/12/10 10:41:20 INFO Configuration.deprecation: mapred.output.value.class
is deprecated. Instead, use mapreduce.job.output.value.class
13/12/10 10:41:20 INFO Configuration.deprecation:
mapred.reduce.tasks.speculative.execution is deprecated. Instead, use
mapreduce.reduce.speculative
13/12/10 10:41:20 INFO Configuration.deprecation: mapreduce.map.class is
deprecated. Instead, use mapreduce.job.map.class
13/12/10 10:41:20 INFO Configuration.deprecation: mapred.job.name is
deprecated. Instead, use mapreduce.job.name
13/12/10 10:41:20 INFO Configuration.deprecation: mapreduce.reduce.class is
deprecated. Instead, use mapreduce.job.reduce.class
13/12/10 10:41:20 INFO Configuration.deprecation:
mapreduce.inputformat.class is deprecated. Instead, use
mapreduce.job.inputformat.class
13/12/10 10:41:20 INFO Configuration.deprecation: mapred.input.dir is
deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
13/12/10 10:41:20 INFO Configuration.deprecation: mapred.output.dir is
deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
13/12/10 10:41:20 INFO Configuration.deprecation:
mapreduce.outputformat.class is deprecated. Instead, use
mapreduce.job.outputformat.class
13/12/10 10:41:20 INFO Configuration.deprecation: mapred.map.tasks is
deprecated. Instead, use mapreduce.job.maps
13/12/10 10:41:20 INFO Configuration.deprecation: mapred.output.key.class
is deprecated. Instead, use mapreduce.job.output.key.class
13/12/10 10:41:20 INFO Configuration.deprecation: mapred.working.dir is
deprecated. Instead, use mapreduce.job.working.dir
13/12/10 10:41:20 INFO mapreduce.JobSubmitter: Submitting tokens for job:
job_1386668372725_0001
13/12/10 10:41:20 INFO impl.YarnClientImpl: Submitted application
application_1386668372725_0001 to ResourceManager at /0.0.0.0:8032
13/12/10 10:41:21 INFO mapreduce.Job: The url to track the job:
http://compute-7-2:8088/proxy/application_1386668372725_0001/
13/12/10 10:41:21 INFO mapreduce.Job: Running job: job_1386668372725_0001
13/12/10 10:41:31 INFO mapreduce.Job: Job job_1386668372725_0001 running in
uber mode : false
13/12/10 10:41:31 INFO mapreduce.Job: map 0% reduce 0%
---- stuck here ----
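
While it hangs I can poke at it from another terminal; a minimal sketch,
assuming the stock yarn CLI and the application id from the output above:

# application id taken from the submission output above
yarn application -status application_1386668372725_0001
yarn logs -applicationId application_1386668372725_0001

As far as I know, yarn logs only returns anything once log aggregation is
enabled and the application has finished, so for a stuck job the AM's
stdout/stderr under the NodeManager's local userlogs directory is probably
the better place to look.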


I hope the problem is not in the environment files. I have the following at
the beginning of hadoop-env.sh:

# The java implementation to use.
export JAVA_HOME=/home/software/jdk1.7.0_25/

# The jsvc implementation to use. Jsvc is required to run secure datanodes.
#export JSVC_HOME=${JSVC_HOME}

export HADOOP_INSTALL=/home/scaino/hadoop-2.2.0-maven/hadoop-dist/target/hadoop-3.0.0-SNAPSHOT

export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_CONF_DIR=$HADOOP_INSTALL"/etc/hadoop"


and this in yarn-env.sh:

export JAVA_HOME=/home/software/jdk1.7.0_25/

export HADOOP_INSTALL=/home/scaino/hadoop-2.2.0-maven/hadoop-dist/target/hadoop-3.0.0-SNAPSHOT

export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_CONF_DIR=$HADOOP_INSTALL"/etc/hadoop"
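
Going by the variables Taka mentioned, I guess I should also add
YARN_CONF_DIR there; a sketch based on my paths above:

# assumption: HADOOP_INSTALL as exported above
export YARN_CONF_DIR=$HADOOP_INSTALL"/etc/hadoop"

(matching the quoting style I already use for HADOOP_CONF_DIR).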


Not sure what to do about HADOOP_YARN_USER though, since I don't have a
dedicated user to run the daemons.
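
(If I read yarn-env.sh correctly, HADOOP_YARN_USER simply defaults to
"yarn" when unset, so my assumption is that pointing it at the current
account is enough:

# assumption: no dedicated yarn user, so run the daemons as the current user
export HADOOP_YARN_USER=$USER

but someone please correct me if that's wrong.)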

Thanks!


On 10 December 2013 10:10, Taka Shinagawa <taka.epsilon@gmail.com> wrote:

> I had a similar problem after setting up Hadoop 2.2.0 based on the
> instructions at
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html
>
> Although it's not documented on the page, I needed to edit hadoop-env.sh
> and yarn-env.sh as well to update
> JAVA_HOME, HADOOP_CONF_DIR, HADOOP_YARN_USER and YARN_CONF_DIR.
>
> Once these variables were set, I was able to run the example successfully.
>
>
>
> On Mon, Dec 9, 2013 at 11:37 PM, Silvina Caíno Lores <
> silvi.caino@gmail.com> wrote:
>
>>
>> Hi everyone,
>>
>> I'm having trouble running the Hadoop examples on a single node. All the
>> executions get stuck in the running state at 0% map and reduce, and the logs
>> don't seem to indicate any issue, besides the need to kill the node manager:
>>
>> compute-0-7-3: nodemanager did not stop gracefully after 5 seconds:
>> killing with kill -9
>>
>> RM
>>
>> 2013-12-09 11:52:22,466 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher:
>> Command to launch container container_1386585879247_0001_01_000001 :
>> $JAVA_HOME/bin/java -Dlog4j.configuration=container-log4j.properties
>> -Dyarn.app.container.log.dir=<LOG_DIR> -Dyarn.app.container.log.filesize=0
>> -Dhadoop.root.logger=INFO,CLA -Xmx1024m
>> org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout
>> 2><LOG_DIR>/stderr
>> 2013-12-09 11:52:22,882 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Done
>> launching container Container: [ContainerId:
>> container_1386585879247_0001_01_000001, NodeId: compute-0-7-3:8010,
>> NodeHttpAddress: compute-0-7-3:8042, Resource: <memory:2000, vCores:1>,
>> Priority: 0, Token: Token { kind: ContainerToken, service: 10.0.7.3:8010}, ]
>> for AM appattempt_1386585879247_0001_000001
>> 2013-12-09 11:52:22,883 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>> appattempt_1386585879247_0001_000001 State change from ALLOCATED to LAUNCHED
>> 2013-12-09 11:52:23,371 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
>> container_1386585879247_0001_01_000001 Container Transitioned from ACQUIRED
>> to RUNNING
>> 2013-12-09 11:52:30,922 INFO SecurityLogger.org.apache.hadoop.ipc.Server:
>> Auth successful for appattempt_1386585879247_0001_000001 (auth:SIMPLE)
>> 2013-12-09 11:52:30,938 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: AM
>> registration appattempt_1386585879247_0001_000001
>> 2013-12-09 11:52:30,939 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=scaino
>> IP=10.0.7.3 OPERATION=Register App Master TARGET=ApplicationMasterService
>> RESULT=SUCCESS APPID=application_1386585879247_0001
>> APPATTEMPTID=appattempt_1386585879247_0001_000001
>> 2013-12-09 11:52:30,941 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>> appattempt_1386585879247_0001_000001 State change from LAUNCHED to RUNNING
>> 2013-12-09 11:52:30,941 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
>> application_1386585879247_0001 State change from ACCEPTED to RUNNING
>>
>>
>> NM
>>
>> 2013-12-10 08:26:02,100 INFO
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got
>> event CONTAINER_STOP for appId application_1386585879247_0001
>> 2013-12-10 08:26:02,102 INFO
>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
>> Deleting absolute path :
>> /scratch/HDFS-scaino-2/tmp/nm-local-dir/usercache/scaino/appcache/application_1386585879247_0001
>> 2013-12-10 08:26:02,103 INFO
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got
>> event APPLICATION_STOP for appId application_1386585879247_0001
>> 2013-12-10 08:26:02,110 INFO
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
>> Application application_1386585879247_0001 transitioned from
>> APPLICATION_RESOURCES_CLEANINGUP to FINISHED
>> 2013-12-10 08:26:02,157 INFO
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler:
>> Scheduling Log Deletion for application: application_1386585879247_0001,
>> with delay of 10800 seconds
>> 2013-12-10 08:26:04,688 INFO
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
>> Stopping resource-monitoring for container_1386585879247_0001_01_000001
>> 2013-12-10 08:26:05,838 INFO
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
>> Done waiting for Applications to be Finished. Still alive:
>> [application_1386585879247_0001]
>> 2013-12-10 08:26:05,839 INFO org.apache.hadoop.ipc.Server: Stopping
>> server on 8010
>> 2013-12-10 08:26:05,846 INFO org.apache.hadoop.ipc.Server: Stopping IPC
>> Server listener on 8010
>> 2013-12-10 08:26:05,847 INFO org.apache.hadoop.ipc.Server: Stopping IPC
>> Server Responder
>>
>> I tried the pi and wordcount examples with the same results. Any ideas on
>> how to debug this?
>>
>> Thanks in advance.
>>
>> Regards,
>> Silvina Caíno
>>
