hadoop-common-user mailing list archives

From "Nguyen Manh Tien" <tien.nguyenm...@gmail.com>
Subject Re: extremely slow reduce jobs
Date Sat, 04 Aug 2007 03:07:15 GMT
I hit the same problem:
 - the copy rate is only 0.1 MBps
 - scp shows 10 MBps of bandwidth between any pair

Maybe the bandwidth is affecting the copy rate.
Can anyone help us?

2007/8/4, Joydeep Sen Sarma <jssarma@facebook.com>:
>
> I have a fairly simple job with a map, a local combiner and a reduce.
> The combiner and the reduce do the equivalent of a group_concat (mysql).
>
>
> I have horrible performance in the reduce stage:
> - the map jobs are done
> - all the reduce jobs claim they are copying data - but the copy rate is
> abysmal (0.5MBps)
>   - checked the network topology - everything's on GigE and on same
> switch. (80 machine cluster)
>   - seeing 50+ MBps bandwidth between any pair using scp
> - when I look at the machines where reduce is running - vmstat says 0%
> cpu util.
>
> A sample reducetask log is below. Job conf: 64-way reduce. I set the
> number of map tasks to the same value, but Hadoop is creating 386 map
> tasks anyway.
>
> Does anyone have any quick hints on what could be going wrong?
>
> Thanks,
>
> Joydeep
>
> 2007-08-03 12:06:54,408 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
> 2007-08-03 12:06:54,408 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
> 2007-08-03 12:06:59,409 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Need 1 map output(s)
> 2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
> outputs from previous failures
> 2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
> 2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
> 2007-08-03 12:07:04,411 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Need 1 map output(s)
> 2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
> outputs from previous failures
> 2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
> 2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
> 2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Need 1 map output(s)
> 2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
> outputs from previous failures
> 2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
> 2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
> 2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Need 1 map output(s)
> 2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
> outputs from previous failures
> 2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
> 2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
> 2007-08-03 12:07:19,417 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Need 1 map output(s)
> 2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
> outputs from previous failures
> 2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
> 2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
> 2007-08-03 12:07:24,419 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Need 1 map output(s)
>
>
>
>
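
The quoted log repeats the same message every five seconds: "Scheduled 0 of 2 known outputs (2 slow hosts and 0 dup hosts)". To check how often that happens across a whole reducetask log, a small script can tally those lines. This is only a rough sketch: the function name and the sample text are made up for illustration, and the pattern is taken from the message wording in the log above, not from the Hadoop source.

```python
import re
from collections import Counter

# Pattern copied from the wording of the log message quoted above.
SCHEDULED_RE = re.compile(
    r"Scheduled (\d+) of (\d+) known outputs "
    r"\((\d+) slow hosts and (\d+) dup hosts\)"
)

def summarize_scheduling(log_text):
    """Tally the 'Scheduled X of Y known outputs' messages in a log."""
    # Mail clients wrap long log lines and prefix them with "> ";
    # strip the prefix and collapse whitespace so each message is
    # matched as one string.
    unquoted = re.sub(r"^>\s?", "", log_text, flags=re.MULTILINE)
    flat = " ".join(unquoted.split())

    summary = Counter()
    for scheduled, known, slow, dup in SCHEDULED_RE.findall(flat):
        summary["attempts"] += 1
        summary["scheduled"] += int(scheduled)
        summary["known"] += int(known)
        summary["slow_hosts"] += int(slow)
        summary["dup_hosts"] += int(dup)
    return summary

# Two entries taken from the log quoted above, still wrapped and quoted.
sample = """\
> 2007-08-03 12:06:54,408 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
> 2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
"""

result = summarize_scheduling(sample)
print(dict(result))
```

If "scheduled" stays at 0 while "slow_hosts" keeps growing, as in the excerpt above, the reducer is not starting any copies from the known hosts at all, which is a different symptom from a copy that is running but slow.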
