hadoop-mapreduce-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From xu cheng <xcheng....@gmail.com>
Subject questions about the order of the map and reduce and the shuffle error
Date Fri, 27 Aug 2010 09:43:20 GMT
hello guys:
   I'm doing some experiences on my 3 node virtual machine cluster, one for
namenode and jobtracker while the other tow for datanode and
tasktracker.with a 0.21.0 hadoop
   and when  I 'm running a job ,I got such message


10/08/27 17:28:58 INFO mapreduce.Job:  map 0% reduce 0%
10/08/27 17:29:10 INFO mapreduce.Job:  map 50% reduce 0%
10/08/27 17:29:12 INFO mapreduce.Job:  map 83% reduce 0%
10/08/27 17:29:19 INFO mapreduce.Job:  map 83% reduce 16%
10/08/27 17:29:24 INFO mapreduce.Job:  map 100% reduce 16%


the reduce runs while the map task hasn't finished!! ( I read from the books
that reduce task runs exactlly after the maps finish!!)

is there something wrong with the cluster or my knowledge?

by the way ,the reduce job stuck while it is in the progress 16.63%, I found
that people on the internet also got this problem but I haven't found the
solution.

however, after some time , after the system report the error message , the
job began to run again! like this


10/08/27 17:29:58 INFO mapreduce.Job:  map 100% reduce 16%
10/08/27 17:30:01 INFO mapreduce.Job:  map 50% reduce 16%
10/08/27 17:30:07 INFO mapreduce.Job:  map 83% reduce 16%
10/08/27 17:30:19 INFO mapreduce.Job:  map 100% reduce 16%
10/08/27 17:30:25 INFO mapreduce.Job:  map 100% reduce 66%
10/08/27 17:30:31 INFO mapreduce.Job:  map 100% reduce 100%


and it runs pretty well!  does someone know about this?

belows are the message on the console, If the logs are needed ,let me
know.thanks

any suggestions and references are appreciated
best regards
xu








10/08/27 17:28:56 INFO security.Groups: Group mapping
impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping;
cacheTimeout=300000
10/08/27 17:28:56 WARN conf.Configuration: mapred.task.id is deprecated.
Instead, use mapreduce.task.attempt.id
10/08/27 17:28:56 WARN mapreduce.JobSubmitter: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the same.
10/08/27 17:28:56 INFO input.FileInputFormat: Total input paths to process :
1
10/08/27 17:28:56 WARN conf.Configuration: mapred.map.tasks is deprecated.
Instead, use mapreduce.job.maps
10/08/27 17:28:56 INFO mapreduce.JobSubmitter: number of splits:2
10/08/27 17:28:57 INFO mapreduce.JobSubmitter: adding the following
namenodes' delegation tokens:null
10/08/27 17:28:57 INFO mapreduce.Job: Running job: job_201008271725_0001
10/08/27 17:28:58 INFO mapreduce.Job:  map 0% reduce 0%
10/08/27 17:29:10 INFO mapreduce.Job:  map 50% reduce 0%
10/08/27 17:29:12 INFO mapreduce.Job:  map 83% reduce 0%
10/08/27 17:29:19 INFO mapreduce.Job:  map 83% reduce 16%
10/08/27 17:29:24 INFO mapreduce.Job:  map 100% reduce 16%
10/08/27 17:29:48 INFO mapreduce.Job: Task Id :
attempt_201008271725_0001_r_000000_0, Status : FAILED
org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in
shuffle in fetcher#1
 at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:124)
 at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:362)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
 at org.apache.hadoop.mapred.Child.main(Child.java:211)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES;
bailing-out.
 at
org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.checkReducerHealth(ShuffleScheduler.java:253)
 at
org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.copyFailed(ShuffleScheduler.java:187)
 at
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:234)
 at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:149)
10/08/27 17:29:48 WARN mapreduce.Job: Error reading task outputConnection
refused
10/08/27 17:29:48 WARN mapreduce.Job: Error reading task outputConnection
refused
10/08/27 17:29:49 INFO mapreduce.Job:  map 100% reduce 0%
10/08/27 17:29:57 INFO mapreduce.Job: Task Id :
attempt_201008271725_0001_m_000000_0, Status : FAILED
Too many fetch-failures
10/08/27 17:29:57 WARN mapreduce.Job: Error reading task outputConnection
refused
10/08/27 17:29:57 WARN mapreduce.Job: Error reading task outputConnection
refused
10/08/27 17:29:58 INFO mapreduce.Job:  map 100% reduce 16%
10/08/27 17:30:01 INFO mapreduce.Job:  map 50% reduce 16%
10/08/27 17:30:07 INFO mapreduce.Job:  map 83% reduce 16%
10/08/27 17:30:19 INFO mapreduce.Job:  map 100% reduce 16%
10/08/27 17:30:25 INFO mapreduce.Job:  map 100% reduce 66%
10/08/27 17:30:31 INFO mapreduce.Job:  map 100% reduce 100%
10/08/27 17:30:33 INFO mapreduce.Job: Job complete: job_201008271725_0001
10/08/27 17:30:33 INFO mapreduce.Job: Counters: 33
 FileInputFormatCounters
  BYTES_READ=76420532
 FileSystemCounters
  FILE_BYTES_READ=155979268
  FILE_BYTES_WRITTEN=239598906
  HDFS_BYTES_READ=76424828
  HDFS_BYTES_WRITTEN=78386951
 Shuffle Errors
  BAD_ID=0
  CONNECTION=0
  IO_ERROR=2
  WRONG_LENGTH=0
  WRONG_MAP=0
  WRONG_REDUCE=0
 Job Counters
  Data-local map tasks=3
  Total time spent by all maps waiting after reserving slots (ms)=0
  Total time spent by all reduces waiting after reserving slots (ms)=0
  SLOTS_MILLIS_MAPS=47747
  SLOTS_MILLIS_REDUCES=77241
  Launched map tasks=3
  Launched reduce tasks=2
 Map-Reduce Framework
  Combine input records=999998
  Combine output records=994179
  Failed Shuffles=1
  GC time elapsed (ms)=667
  Map input records=499999
  Map output bytes=80759850
  Map output records=999998
  Merged Map outputs=2
  Reduce input groups=993814
  Reduce input records=994179
  Reduce output records=993814
  Reduce shuffle bytes=83049802
  Shuffled Maps =2
  Spilled Records=2861309
  SPLIT_RAW_BYTES=200

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message