hadoop-mapreduce-user mailing list archives

From Friso van Vollenhoven <fvanvollenho...@xebia.com>
Subject Re: reducers run past 100% (does that problem still exist?)
Date Tue, 22 Jun 2010 10:20:22 GMT
Hi Ravi,

The reducers are all in the reduce phase (so copy and sort have finished by then). We do
use compression on the mapper output, but I thought the issues relating to that were fixed
in the 0.20.2 release. Can you (or anyone) confirm that such a bug still exists?
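
For illustration, here is my understanding of how the numbers could overshoot. This is a
sketch only, not Hadoop's actual code: if reduce-phase progress were estimated as bytes
read over expected input size, with bytes counted after decompression while the total is
taken from the compressed size, the ratio would exceed 1.0:

    public class OvershootSketch {
        public static void main(String[] args) {
            // Hypothetical sizes for illustration only.
            long compressedTotal  = 100L << 20; // expected input (compressed): 100 MB
            long decompressedRead = 140L << 20; // bytes counted after decompression: 140 MB
            double progress = (double) decompressedRead / compressedTotal;
            System.out.printf("reduce-phase progress: %.0f%%%n", progress * 100); // 140%
        }
    }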


On Jun 22, 2010, at 11:57 AM, Ravi Gummadi wrote:

> Reduce task has 3 phases: copy phase, sort phase, and reduce phase. Each phase corresponds
> to 33.33% of the total reduce task's progress. Which phase was your reducer in when you
> saw the progress go past 100%? (You can see the phase on the web UI in the "state" column,
> after the ">" symbol.)
> If you see progress > 66.7% while the task is in the sort phase, the problem could be in
> the merge progress calculation, which was fixed in HADOOP-5210. Hadoop 0.20.2 should
> already contain that fix.
> Otherwise, if the reduce task's third phase (the reduce phase) started at 66.7% and the
> progress then went beyond 100%, the bug (in Hadoop) may be that progress is not calculated
> correctly for compressed input to the reducer.
> -Ravi
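
As a concrete illustration of the three-phase weighting described above (a sketch with
made-up names, not Hadoop's source):

    public class ReduceProgressSketch {
        // Each argument is that phase's completion fraction in [0, 1].
        static float reduceTaskProgress(float copy, float sort, float reduce) {
            return (copy + sort + reduce) / 3.0f; // each phase weighs 33.33%
        }

        public static void main(String[] args) {
            // Copy and sort done, reduce half way: ~83% overall.
            System.out.println(reduceTaskProgress(1.0f, 1.0f, 0.5f));
            // A reduce-phase fraction miscomputed above 1.0 (e.g. 1.4)
            // pushes the overall figure past 100%: ~113%.
            System.out.println(reduceTaskProgress(1.0f, 1.0f, 1.4f));
        }
    }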
> Friso van Vollenhoven wrote:
>> Hi all,
>> When I run long-running map/reduce jobs, the reducers run past 100% before reaching
>> completion, sometimes as far as 140%. I have searched the mailing list and other
>> resources and noticed bug reports related to this when using map output compression,
>> but all appear to be fixed by now.
>> The job I am running reads sequence files from HDFS and, in the reducer, inserts
>> records into HBase. The reducer has NullWritable as both output key and output value.
>> Some additional info (a JobConf sketch of these settings follows this message):
>> - the job takes close to 60 hours in total to complete
>> - there are 10 reducers
>> - the map output is compressed using the default codec and block compression
>> - speculative execution is turned off (otherwise we could be hitting HBase harder
>> than necessary)
>> - mapred.job.reuse.jvm.num.tasks = 1
>> - io.sort.factor = 100
>> - io.sort.record.percent = 0.3
>> - io.sort.spill.percent = 0.9
>> - mapred.inmem.merge.threshold = 100
>> - mapred.job.reduce.input.buffer.percent = 1.0
>> I am using Hadoop 0.20.2 on a small cluster (1x NN+JT, 4x DN+TT).
>> Does anyone have a clue? Or can anyone tell me how the progress info for reducers
>> is calculated? Any help is appreciated.
>> Regards,
>> Friso
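
For reference, the settings listed in the quoted message could be expressed on a 0.20-era
JobConf roughly as below. This is a sketch: the class name is made up, and the two
speculative-execution keys are my assumption for "speculative execution is turned off";
the remaining property keys are as listed in the message.

    import org.apache.hadoop.mapred.JobConf;

    public class JobSetupSketch {
        public static void main(String[] args) {
            JobConf conf = new JobConf();
            conf.setNumReduceTasks(10);
            conf.setCompressMapOutput(true); // default codec when none is set explicitly
            conf.setBoolean("mapred.map.tasks.speculative.execution", false);
            conf.setBoolean("mapred.reduce.tasks.speculative.execution", false);
            conf.setInt("mapred.job.reuse.jvm.num.tasks", 1);
            conf.setInt("io.sort.factor", 100);
            conf.setFloat("io.sort.record.percent", 0.3f);
            conf.setFloat("io.sort.spill.percent", 0.9f);
            conf.setInt("mapred.inmem.merge.threshold", 100);
            conf.setFloat("mapred.job.reduce.input.buffer.percent", 1.0f);
            // (Block compression of the intermediate data is omitted here;
            // the relevant property key varies by version.)
        }
    }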
