hadoop-mapreduce-user mailing list archives

From Virajith Jalaparti <virajit...@gmail.com>
Subject Re: How is reduce completion % calculated?
Date Wed, 08 Jun 2011 14:31:44 GMT
Sean, can you point me to the file where the exact calculation of the %
progress of the Map/Reduce phases takes place? I have been trying to find it
by following the {Task/TaskTracker/TaskInProgress/Progress/JobInProgress}.java
files, but I was only able to find the phase-wise division in the Progress.java
file in the util directory.
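
To make sure I am reading the phase-wise division correctly, here is a minimal
sketch of how I understand it: a task has equally weighted phases, and the
overall value is (completed phases + fraction of the current phase) divided by
the number of phases. The class name PhaseProgressSketch and its methods are
hypothetical and only illustrate the idea; this is not the actual Hadoop code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of phase-weighted progress; not the Hadoop source. */
public class PhaseProgressSketch {
    // Child phases, each assumed to carry an equal share of the parent.
    private final List<PhaseProgressSketch> phases = new ArrayList<>();
    private int currentPhase = 0;
    private float progress = 0f; // only meaningful for a node without phases

    /** Adds a child phase and returns it so the caller can update it. */
    public PhaseProgressSketch addPhase() {
        PhaseProgressSketch phase = new PhaseProgressSketch();
        phases.add(phase);
        return phase;
    }

    /** Marks the current phase as finished and moves to the next one. */
    public void startNextPhase() {
        currentPhase++;
    }

    /** Sets the completion fraction of a leaf node, clamped to [0, 1]. */
    public void set(float p) {
        progress = Math.max(0f, Math.min(1f, p));
    }

    /** Returns overall completion in [0, 1]. */
    public float get() {
        if (phases.isEmpty()) {
            return progress;
        }
        if (currentPhase >= phases.size()) {
            return 1f; // every phase has finished
        }
        float perPhase = 1f / phases.size();
        return perPhase * (currentPhase + phases.get(currentPhase).get());
    }

    public static void main(String[] args) {
        PhaseProgressSketch reduceTask = new PhaseProgressSketch();
        PhaseProgressSketch copy = reduceTask.addPhase();
        PhaseProgressSketch sort = reduceTask.addPhase();
        PhaseProgressSketch reduce = reduceTask.addPhase();

        copy.set(0.5f); // half of the map outputs copied so far
        System.out.println("mid-copy:   " + reduceTask.get()); // about 1/6

        reduceTask.startNextPhase(); // copy done, sort starts
        reduceTask.startNextPhase(); // sort done, reduce starts
        reduce.set(0.25f);
        System.out.println("mid-reduce: " + reduceTask.get()); // (2 + 0.25) / 3
    }
}
```

With three phases, a reducer that has copied half of the map outputs would
report roughly 17%, which would explain a non-zero reduce % while the maps are
still running.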

Thanks a lot,
Virajith

On Wed, Jun 8, 2011 at 3:27 PM, Sean Owen <srowen@gmail.com> wrote:

> Exactly, the reducer will show it's in the "copy" phase here which is
> exactly what it can do before the mappers have finished.
>
> It's not true that a single reducer's completion can only be 0, 0.33, 0.67,
> or 1.0. Of course it makes progress within the copy (shuffle), sort, and
> reduce phases, by chunk and by record, so it can report much smaller quanta
> of progress than that.
>
>
> On Wed, Jun 8, 2011 at 3:19 PM, John Armstrong <john.armstrong@ccri.com> wrote:
>
>> On Wed, 8 Jun 2011 15:09:41 +0100, Virajith Jalaparti
>> <virajith.j@gmail.com> wrote:
>> > I was looking at the syslog generated by my job run and it looks like the
>> > reducers start before the mappers complete. I figured this was the case
>> > because even when the Map had <100% completion, the reduce completion %
>> > was greater than 0.
>>
>> This is true; as mappers complete they start delivering their output to
>> reducers, which can start their "sort" phase.  What you're seeing is
>> reducers completing some portion of their sort phase on the completed
>> mapper output.
>>
>
>
