hadoop-common-user mailing list archives

From "Miles Osborne" <mi...@inf.ed.ac.uk>
Subject Re: How can reducers start before the mappers have finished?
Date Mon, 03 Mar 2008 19:28:56 GMT
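Reducers can copy the mapper output before the actual reducing begins. If
you look at the GUI, you will see each reduce task go through "copy", "sort"
and then the actual reduce. The reduce function itself still only runs once
all the map output has been copied and sorted, so each call sees every value
for its key.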
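
To make the three phases concrete, here is a small sketch in plain Python.
It is a toy simulation, not Hadoop code: the names run_job and map_words are
made up for illustration, and the "mappers" simply run one after another.
The point it shows is that the copy step can happen as each mapper finishes,
while the sort and reduce steps wait until everything has been copied.

```python
from itertools import groupby
from operator import itemgetter


def map_words(line):
    """Hypothetical map function: emit (word, 1) for every word in a line."""
    return [(word, 1) for word in line.split()]


def run_job(splits):
    """Simulate one reducer serving several mappers (one mapper per split)."""
    copied = []  # map output the reducer has fetched so far
    log = []     # order in which the phases happen

    # Copy phase: fetch each mapper's output as soon as that mapper is done,
    # without waiting for the remaining mappers.
    for i, split in enumerate(splits):
        output = [pair for line in split for pair in map_words(line)]
        log.append(f"map {i} done")
        copied.extend(output)
        log.append(f"copy from map {i}")

    # Sort phase: only possible once every mapper's output has been copied.
    copied.sort(key=itemgetter(0))
    log.append("sort")

    # Reduce phase: each key is now grouped with all of its values.
    result = {}
    for key, group in groupby(copied, key=itemgetter(0)):
        result[key] = sum(count for _, count in group)
    log.append("reduce")
    return result, log


if __name__ == "__main__":
    counts, phases = run_job([["a b a"], ["b c"]])
    print(phases)
    print(counts)
```

In the log, "copy from map 0" appears before "map 1 done", which is the
sense in which a reducer has "started" while mappers are still running.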

Miles

On 03/03/2008, Marc Harris <mharris@jumptap.com> wrote:
>
> I noticed when reading http://wiki.apache.org/hadoop/HardwareBenchmarks
> the following comment:
>
> "I ran into some odd behavior on Herd2 where if i [ . . . ] the reducers
> don't start until the mappers finish, slowing the job significantly."
>
> This puzzled me. I don't see how reducers can ever start before the
> mappers have finished. I thought that any given call to a reducer will
> supply all the (key,value) pairs for a given value of the key. How can a
> reducer start until all the different values for a key are known? And
> thus how can a reducer start before all the mappers have finished?
>
>
> - Marc
>
>


-- 
The University of Edinburgh is a charitable body, registered in Scotland,
with registration number SC005336.
