hadoop-common-user mailing list archives

From prasenjit mukherjee <pmukher...@quattrowireless.com>
Subject Re: do all mappers finish before reducer starts
Date Wed, 27 Jan 2010 03:00:32 GMT
For algebraic reduce functions, it should be able to start the user reduce
function (step 3) in parallel as well, even before the mappers complete, right?
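
To make "algebraic" concrete, here is a minimal sketch in plain Python (it
does not use the Hadoop API). The mapper outputs and the helper names
`partial_reduce` and `merge_partials` are made up for illustration. It only
shows that, for an associative and commutative function such as a sum, folding
in each mapper's output as it arrives gives the same answer as waiting for all
mappers and reducing once. It does not claim that Hadoop applies the user
reduce function this way.

```python
import operator
from collections import defaultdict


def partial_reduce(pairs, combine):
    """Fold one mapper's (key, value) pairs into one partial result per key."""
    partials = {}
    for key, value in pairs:
        partials[key] = combine(partials[key], value) if key in partials else value
    return partials


def merge_partials(acc, partials, combine):
    """Merge one mapper's partial results into the running per-key totals."""
    for key, value in partials.items():
        acc[key] = combine(acc[key], value) if key in acc else value
    return acc


# Hypothetical outputs of three mappers that finish at different times.
mapper_outputs = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("b", 4), ("c", 5)],
    [("a", 6), ("c", 7)],
]

# Incremental: fold in each mapper's output as soon as it is available.
incremental = {}
for pairs in mapper_outputs:
    merge_partials(incremental, partial_reduce(pairs, operator.add), operator.add)

# Batch: wait for every mapper, group all values by key, then reduce once.
grouped = defaultdict(list)
for pairs in mapper_outputs:
    for key, value in pairs:
        grouped[key].append(value)
batch = {key: sum(values) for key, values in grouped.items()}

print(incremental == batch)
```

A function that needs every value at once (a median, for example) would not
give matching results under this kind of incremental folding.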

On Wed, Jan 27, 2010 at 4:19 AM, Ed Mazur <mazur@cs.umass.edu> wrote:

> You're right that the user reduce function cannot be applied until all
> maps have completed. The values being reported about job completion
> are a bit misleading in this sense. The reduce percentage you're
> seeing actually encompasses three parts:
>
> 1. Fetching map output data
> 2. Merging map output data
> 3. Applying the user reduce function
>
> Only the third part has the constraint of waiting for all maps; the
> other two can be done in parallel, hence the reduce percentage
> increasing before map completes. 0-33% reduce corresponds to step 1,
> 33-67% to step 2, and 67-100% to step 3. There is overlap between
> parts 1 and 2 as the reduce memory buffer fills up, merges, and spills
> to disk. There is also overlap between parts 2 and 3 because the final
> merge is fed directly into the user reduce function to minimize the
> amount of data written to disk.
>
> Ed
>
> On Tue, Jan 26, 2010 at 5:27 PM, adeelmahmood <adeelmahmood@gmail.com> wrote:
> >
> > I just have a conceptual question. My understanding is that all the
> > mappers have to complete their work before the reducers can start, because
> > mappers don't know about each other: we need the values for a given key
> > from all the different mappers, so we have to wait until all mappers have
> > collectively given the system all possible values for a key, so that they
> > can then be passed on to the reducer.
> > But when I ran these jobs, almost every time the reducers start working
> > before the mappers are all done, so it would say map 60% reduce 30%. How
> > does this work?
> > Does it find all possible values for a single key from all mappers, pass
> > them on to the reducer, and then work on other keys?
> > Any help is appreciated.
> > --
> > View this message in context:
> > http://old.nabble.com/do-all-mappers-finish-before-reducer-starts-tp27330927p27330927.html
> > Sent from the Hadoop core-user mailing list archive at Nabble.com.
> >
> >
>
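
As a rough illustration of the three-part breakdown quoted above, the sketch
below maps per-phase completion to a single reduce percentage. The function
name `reduce_percent` is hypothetical, and the equal one-third weighting is an
assumption inferred from the 0-33%, 33-67%, 67-100% ranges in the quoted
message, not taken from the Hadoop source.

```python
def reduce_percent(fetch_done, merge_done, user_reduce_done):
    """Combine three phase fractions (each 0.0 to 1.0) into one percentage.

    Assumes each phase contributes an equal third of the reported value.
    """
    fractions = (fetch_done, merge_done, user_reduce_done)
    if any(not 0.0 <= f <= 1.0 for f in fractions):
        raise ValueError("each phase fraction must be between 0.0 and 1.0")
    return 100.0 * sum(fractions) / 3


# Map output fetching is 90% done; merging and the user reduce function
# have not started yet.
print(round(reduce_percent(0.9, 0.0, 0.0), 1))
```

Under this assumption, a display such as "map 60% reduce 30%" is consistent
with reducers that have only been fetching map output so far and have not yet
called the user reduce function.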
