hadoop-yarn-dev mailing list archives

From "Vincent,Wei" <weikun0...@gmail.com>
Subject Re: About Map 100% reduce %0 issue
Date Wed, 26 Mar 2014 07:31:07 GMT
I guess the problem may be in this part of the log:

2014-03-26 23:13:43,900 INFO [fetcher#5]
org.apache.hadoop.mapreduce.task.reduce.Fetcher: for
url=13562/mapOutput?job=job_1395846985948_0001&reduce=0&map=attempt_1395846985948_0001_m_000000_0
sent hash and received reply

2014-03-26 23:13:43,913 WARN [fetcher#5]
org.apache.hadoop.mapreduce.task.reduce.Fetcher: Invalid map id

java.lang.IllegalArgumentException: TaskAttemptId string : TTP/1.1 500
Internal Server Error

Content-Type: text/plain; charset=UTF is not properly formed
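
The odd string in the exception looks consistent with the fetcher reading the start of an HTTP error response where it expected a length-prefixed map id. The 'H' of "HTTP/1.1" has byte value 72, and the text quoted in the exception is exactly 72 characters once the CRLF is counted, which would explain why it starts at "TTP" and stops at "charset=UTF". Below is a minimal Python sketch of that idea. It is not Hadoop's code: the function name and the sample response bytes are assumptions for illustration, and only a single-byte length prefix is handled.

```python
def read_length_prefixed_string(data: bytes) -> str:
    """Decode <1-byte length><UTF-8 bytes>; multi-byte lengths are out of scope."""
    length = data[0]
    if length > 127:
        raise ValueError("multi-byte length prefix not handled in this sketch")
    return data[1:1 + length].decode("utf-8")


# Assumed sample: the start of an HTTP error response arriving where the
# reducer expected a shuffle header that begins with the map attempt id.
raw = (b"HTTP/1.1 500 Internal Server Error\r\n"
       b"Content-Type: text/plain; charset=UTF-8\r\n\r\n")

decoded = read_length_prefixed_string(raw)
print(raw[0])         # byte value of 'H', interpreted here as the length
print(repr(decoded))  # the text that would be handed to the id parser
```

If this reading is right, the 500 comes from the shuffle service on slave1:13562, so the NodeManager log on that node is probably the next place to look.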



2014-03-26 15:29 GMT+08:00 Vincent,Wei <weikun0905@gmail.com>:

> Hitesh, I have checked this setting. I use the default yarn-site.xml,
> and I can see the property already has the value you mentioned:
>
>
> <property>
>       <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
>       <value>org.apache.hadoop.mapred.ShuffleHandler</value>
>     </property>
>
>
> 2014-03-26 15:27 GMT+08:00 Vincent,Wei <weikun0905@gmail.com>:
>
> Here is the container log:
>>
>> 2014-03-26 23:13:42,891 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an
attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
>>
>> 2014-03-26 23:13:42,892 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an
attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
>>
>> 2014-03-26 23:13:43,126 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig:
loaded properties from hadoop-metrics2.properties
>>
>> 2014-03-26 23:13:43,159 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
Scheduled snapshot period at 10 second(s).
>>
>> 2014-03-26 23:13:43,159 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
ReduceTask metrics system started
>>
>> 2014-03-26 23:13:43,164 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing
with tokens:
>>
>> 2014-03-26 23:13:43,164 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job,
Service: job_1395846985948_0001, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@6699a74e)
>>
>> 2014-03-26 23:13:43,201 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping
for 0ms before retrying again. Got null now.
>>
>> 2014-03-26 23:13:43,370 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an
attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
>>
>> 2014-03-26 23:13:43,371 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an
attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
>>
>> 2014-03-26 23:13:43,405 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir
for child: /tmp/hadoop-haduser/nm-local-dir/usercache/haduser/appcache/application_1395846985948_0001
>>
>> 2014-03-26 23:13:43,454 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an
attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
>>
>> 2014-03-26 23:13:43,455 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an
attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
>>
>> 2014-03-26 23:13:43,488 INFO [main] org.apache.hadoop.conf.Configuration.deprecation:
mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
>>
>> 2014-03-26 23:13:43,488 INFO [main] org.apache.hadoop.conf.Configuration.deprecation:
mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
>>
>> 2014-03-26 23:13:43,489 INFO [main] org.apache.hadoop.conf.Configuration.deprecation:
mapred.tip.id is deprecated. Instead, use mapreduce.task.id
>>
>> 2014-03-26 23:13:43,489 INFO [main] org.apache.hadoop.conf.Configuration.deprecation:
mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
>>
>> 2014-03-26 23:13:43,489 INFO [main] org.apache.hadoop.conf.Configuration.deprecation:
mapred.local.dir is deprecated. Instead, use mapreduce.cluster.local.dir
>>
>> 2014-03-26 23:13:43,490 INFO [main] org.apache.hadoop.conf.Configuration.deprecation:
job.local.dir is deprecated. Instead, use mapreduce.job.local.dir
>>
>> 2014-03-26 23:13:43,490 INFO [main] org.apache.hadoop.conf.Configuration.deprecation:
mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
>>
>> 2014-03-26 23:13:43,490 INFO [main] org.apache.hadoop.conf.Configuration.deprecation:
mapred.job.id is deprecated. Instead, use mapreduce.job.id
>>
>> 2014-03-26 23:13:43,602 INFO [main] org.apache.hadoop.conf.Configuration.deprecation:
session.id is deprecated. Instead, use dfs.metrics.session-id
>>
>> 2014-03-26 23:13:43,773 INFO [main] org.apache.hadoop.mapred.Task:  Using ResourceCalculatorProcessTree
: [ ]
>>
>> 2014-03-26 23:13:43,801 INFO [main] org.apache.hadoop.mapred.ReduceTask: Using ShuffleConsumerPlugin:
org.apache.hadoop.mapreduce.task.reduce.Shuffle@5f4960eb
>>
>> 2014-03-26 23:13:43,810 INFO [main] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl:
MergerManager: memoryLimit=140928608, maxSingleShuffleLimit=35232152, mergeThreshold=93012888,
ioSortFactor=10, memToMemMergeOutputsThreshold=10
>>
>> 2014-03-26 23:13:43,812 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher:
attempt_1395846985948_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion
Events
>>
>> 2014-03-26 23:13:43,816 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher:
attempt_1395846985948_0001_r_000000_0: Got 1 new map-outputs
>>
>> 2014-03-26 23:13:43,816 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl:
Assigning slave1:13562 with 1 to fetcher#5
>>
>> 2014-03-26 23:13:43,816 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl:
assigned 1 of 1 to slave1:13562 to fetcher#5
>>
>> 2014-03-26 23:13:43,900 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.Fetcher:
for url=13562/mapOutput?job=job_1395846985948_0001&reduce=0&map=attempt_1395846985948_0001_m_000000_0
sent hash and received reply
>>
>> 2014-03-26 23:13:43,913 WARN [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.Fetcher:
Invalid map id
>>
>> java.lang.IllegalArgumentException: TaskAttemptId string : TTP/1.1 500 Internal Server
Error
>>
>> Content-Type: text/plain; charset=UTF is not properly formed
>>
>>         at org.apache.hadoop.mapreduce.TaskAttemptID.forName(TaskAttemptID.java:201)
>>
>>         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:386)
>>
>>         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:341)
>>
>>         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)
>>
>> 2014-03-26 23:13:43,914 WARN [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.Fetcher:
copyMapOutput failed for tasks [attempt_1395846985948_0001_m_000000_0]
>>
>> 2014-03-26 23:13:43,914 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl:
Reporting fetch failure for attempt_1395846985948_0001_m_000000_0 to jobtracker.
>>
>> 2014-03-26 23:13:43,914 FATAL [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl:
Shuffle failed with too many fetch failures and insufficient progress!
>>
>> 2014-03-26 23:13:43,915 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl:
slave1:13562 freed by fetcher#5 in 99ms
>>
>> 2014-03-26 23:13:43,915 ERROR [main] org.apache.hadoop.security.UserGroupInformation:
PriviledgedActionException as:haduser (auth:SIMPLE) cause:org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError:
error in shuffle in fetcher#5
>>
>> 2014-03-26 23:13:43,915 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception
running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle
in fetcher#5
>>
>>         at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:121)
>>
>>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:380)
>>
>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>
>>         at java.security.AccessController.doPrivileged(Native Method)
>>
>>         at javax.security.auth.Subject.doAs(Subject.java:415)
>>
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>
>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>>
>> Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
>>
>>         at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:323)
>>
>>         at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:245)
>>
>>         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:347)
>>
>>         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)
>>
>>
>>
>> 2014-03-26 23:13:43,918 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup
for the task
>>
>> 2014-03-26 23:13:43,929 WARN [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter:
Could not delete hdfs://master:9000/output/15/_temporary/1/_temporary/attempt_1395846985948_0001_r_000000_0
>>
>> 2014-03-26 23:13:44,032 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
Stopping ReduceTask metrics system...
>>
>> 2014-03-26 23:13:44,032 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
ReduceTask metrics system stopped.
>>
>> 2014-03-26 23:13:44,032 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
ReduceTask metrics system shutdown complete.
>>
>>
>>
>> 2014-03-26 0:25 GMT+08:00 Hitesh Shah <hitesh@apache.org>:
>>
>> You are missing the following in your yarn site:
>>>
>>>     <property>
>>>       <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
>>>       <value>org.apache.hadoop.mapred.ShuffleHandler</value>
>>>     </property>
>>>
>>> You will need to restart the nodemanager for this property to take
>>> effect.
>>>
>>> -- Hitesh
>>>
>>> On Mar 24, 2014, at 9:34 PM, Vincent,Wei wrote:
>>>
>>> > All
>>> >
>>> > I am a newcomer to Hadoop. I have run the wordcount example from
>>> > hadoop-mapreduce-examples-2.2.0.jar, but it always hangs at
>>> > map 100% and reduce 0%.
>>> >
>>> > 14/03/25 20:19:20 INFO client.RMProxy: Connecting to ResourceManager at master/159.99.249.63:8032
>>> > 14/03/25 20:19:20 INFO input.FileInputFormat: Total input paths to process : 1
>>> > 14/03/25 20:19:20 INFO mapreduce.JobSubmitter: number of splits:1
>>> > 14/03/25 20:19:20 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
>>> > 14/03/25 20:19:20 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
>>> > 14/03/25 20:19:20 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
>>> > 14/03/25 20:19:20 INFO Configuration.deprecation: mapreduce.combine.class is deprecated. Instead, use mapreduce.job.combine.class
>>> > 14/03/25 20:19:20 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
>>> > 14/03/25 20:19:20 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
>>> > 14/03/25 20:19:20 INFO Configuration.deprecation: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
>>> > 14/03/25 20:19:20 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
>>> > 14/03/25 20:19:20 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
>>> > 14/03/25 20:19:20 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
>>> > 14/03/25 20:19:20 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
>>> > 14/03/25 20:19:20 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
>>> > 14/03/25 20:19:20 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1395747600383_0002
>>> > 14/03/25 20:19:20 INFO impl.YarnClientImpl: Submitted application application_1395747600383_0002 to ResourceManager at master/159.99.249.63:8032
>>> > 14/03/25 20:19:20 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1395747600383_0002/
>>> > 14/03/25 20:19:20 INFO mapreduce.Job: Running job: job_1395747600383_0002
>>> > 14/03/25 20:19:24 INFO mapreduce.Job: Job job_1395747600383_0002 running in uber mode : false
>>> > 14/03/25 20:19:24 INFO mapreduce.Job:  map 0% reduce 0%
>>> > 14/03/25 20:19:28 INFO mapreduce.Job:  map 100% reduce 0%
>>> > 14/03/25 20:19:31 INFO mapreduce.Job: Task Id : attempt_1395747600383_0002_r_000000_0, Status : FAILED
>>> > Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#5
>>> >        at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:121)
>>> >        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:380)
>>> >        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>> >        at java.security.AccessController.doPrivileged(Native Method)
>>> >        at javax.security.auth.Subject.doAs(Subject.java:415)
>>> >        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>> >        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>>> > Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
>>> >        at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:323)
>>> >        at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:245)
>>> >        at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:347)
>>> >        at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)
>>> >
>>> > Someone says that this is caused by the hosts configuration. I have
>>> > checked /etc/hosts on the master and all slaves:
>>> > 127.0.0.1       localhost.localdomain localhost
>>> > 159.99.249.63   master
>>> > 159.99.249.203  slave1
>>> > 159.99.249.99   slave2
>>> > 159.99.249.88   slave3
>>> >
>>> > Would you please help me fix the issue? Many thanks.
>>> >
>>> > My yarn-site.xml:
>>> >
>>> > <?xml version="1.0"?>
>>> > <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>> >
>>> >
>>> > <configuration>
>>> >
>>> > <property>
>>> > <description>The hostname of the RM.</description>
>>> > <name>yarn.resourcemanager.hostname</name>
>>> > <value>master</value>
>>> > </property>
>>> >
>>> > <property>
>>> > <name>yarn.nodemanager.aux-services</name>
>>> > <value>mapreduce_shuffle</value>
>>> > </property>
>>> >
>>> > <property>
>>> > <description>The address of the container manager in the NM.</description>
>>> > <name>yarn.nodemanager.address</name>
>>> > <value>${yarn.nodemanager.hostname}:8041</value>
>>> > </property>
>>> >
>>> > </configuration>
>>> >
>>> > My mapred-site.xml:
>>> >
>>> > <configuration>
>>> >        <property>
>>> >        <name>mapreduce.framework.name</name>
>>> >        <value>yarn</value>
>>> >        </property>
>>> >
>>> > <property>
>>> >  <name>mapreduce.reduce.shuffle.merge.percent</name>
>>> >  <value>0.33</value>
>>> >  <description>The usage threshold at which an in-memory merge will be
>>> >  initiated, expressed as a percentage of the total memory allocated to
>>> >  storing in-memory map outputs, as defined by
>>> >  mapreduce.reduce.shuffle.input.buffer.percent.
>>> >  </description>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>mapreduce.reduce.shuffle.input.buffer.percent</name>
>>> >  <value>0.35</value>
>>> >  <description>The percentage of memory to be allocated from the maximum heap
>>> >  size to storing map outputs during the shuffle.
>>> >  </description>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>mapreduce.reduce.shuffle.memory.limit.percent</name>
>>> >  <value>0.12</value>
>>> >  <description>Expert: Maximum percentage of the in-memory limit that a
>>> >  single shuffle can consume</description>
>>> > </property>
>>> >
>>> > </configuration>
>>> >
>>> >
>>> > --
>>> > BR,
>>> >
>>> > Vincent.Wei
>>>
>>>
>>
>>
>> --
>> BR,
>>
>> Vincent.Wei
>>
>
>
>
> --
> BR,
>
> Vincent.Wei
>
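
To double-check the setting Hitesh mentioned on every node, a small script could read yarn-site.xml and report whether both shuffle properties are present. This is only a sketch: the function names and the inline sample XML are made up for illustration, and it inspects just the one file, not the effective configuration the NodeManager actually loads (values coming from yarn-default.xml are not considered).

```python
import xml.etree.ElementTree as ET

AUX_SERVICES = "yarn.nodemanager.aux-services"
SHUFFLE_CLASS = "yarn.nodemanager.aux-services.mapreduce_shuffle.class"

# Inline sample for illustration; in practice read the real file's text.
SAMPLE_YARN_SITE = """<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>"""


def load_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_shuffle_settings(props):
    """List the shuffle-related property names that are absent or incomplete."""
    missing = []
    services = [s.strip() for s in props.get(AUX_SERVICES, "").split(",")]
    if "mapreduce_shuffle" not in services:
        missing.append(AUX_SERVICES)
    if SHUFFLE_CLASS not in props:
        missing.append(SHUFFLE_CLASS)
    return missing


print(missing_shuffle_settings(load_properties(SAMPLE_YARN_SITE)))
```

An empty list would mean both entries are present in that file. It says nothing about whether the NodeManager was restarted after the change.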



-- 
BR,

Vincent.Wei
