hadoop-common-user mailing list archives

From "Joydeep Sen Sarma" <jssa...@facebook.com>
Subject extremely slow reduce jobs
Date Fri, 03 Aug 2007 19:17:37 GMT
I have a fairly simple job with a map, a local combiner and a reduce.
The combiner and the reduce do the equivalent of a GROUP_CONCAT (MySQL).
 
 
I am seeing horrible performance in the reduce stage:
- the map tasks are done
- all the reduce tasks claim they are copying data, but the copy rate is
abysmal (0.5 MB/s)
  - checked the network topology: everything is on GigE and on the same
switch (80-machine cluster)
  - seeing 50+ MB/s bandwidth between any pair of machines using scp
- when I look at the machines where the reduces are running, vmstat shows 0%
CPU utilization.
 
A sample reduce task log is below. Job conf: 64-way reduce. I set the
number of map tasks to the same value, but Hadoop is creating 386 map
tasks anyway.
 
Does anyone have any quick hints on what could be going wrong?
 
Thanks,
 
Joydeep
 
2007-08-03 12:06:54,408 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:06:54,408 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:06:59,409 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:04,411 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:19,417 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:24,419 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
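
To put a number on the stall in a log like the one above, the scheduling lines can be tallied. This is only a rough sketch in Python, assuming the log format matches the excerpt exactly; the function name summarize_scheduling is made up for illustration and is not part of Hadoop.

```python
import re

# Matches entries such as:
#   task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0 dup hosts)
SCHEDULED = re.compile(
    r"(task_\w+) Scheduled (\d+) of (\d+) known outputs "
    r"\((\d+) slow hosts and (\d+) dup hosts\)"
)


def summarize_scheduling(log_text):
    """Count scheduling rounds and how many of them scheduled nothing."""
    # Log entries are hard-wrapped across lines, so collapse all whitespace
    # into single spaces before matching.
    flat = " ".join(log_text.split())
    stats = {"rounds": 0, "stalled_rounds": 0, "scheduled": 0, "slow_hosts": 0}
    for _task, scheduled, _known, slow, _dup in SCHEDULED.findall(flat):
        stats["rounds"] += 1
        stats["scheduled"] += int(scheduled)
        stats["slow_hosts"] += int(slow)
        if int(scheduled) == 0:
            # Nothing was fetched in this round.
            stats["stalled_rounds"] += 1
    return stats
```

Calling summarize_scheduling(open(path).read()) on a reduce task log (path being wherever the log is stored) returns the counts as a dict. If stalled_rounds equals rounds, as it would for the excerpt above, the task scheduled no fetches at all during that period.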

 

