hadoop-common-dev mailing list archives

From Amar Kamat <ama...@yahoo-inc.com>
Subject Re: do maps explicitly finish before reduces begin in hadoop
Date Mon, 03 Mar 2008 15:11:15 GMT
In Hadoop the reducers can start alongside the maps, because the shuffle 
phase begins pulling map outputs as soon as they become available. This 
overlaps the map phase with the shuffle phase. The actual reduce happens 
only after all the maps have completed and the map output destined for 
that reducer has been sorted. So in Hadoop, too, the reduce function is 
applied only after all the maps finish; the reducers start in parallel 
only to do the shuffling.
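
To make the ordering concrete, here is a rough toy model in plain Python. It
is not Hadoop code: run_job, word_count_map and word_count_reduce are
made-up names, there is a single reducer, and everything runs sequentially.
It only illustrates that each map output is fetched as soon as that map
finishes, while the reduce function runs after every map is done and the
fetched data is sorted.

```python
from collections import defaultdict

def run_job(splits, map_fn, reduce_fn):
    """Toy single-reducer model: shuffle overlaps maps, reduce waits for all maps."""
    events = []    # ordered log of what happened, for inspection
    fetched = []   # map outputs the reducer has pulled so far

    for i, split in enumerate(splits):
        # Map task i runs to completion and produces (key, value) pairs.
        output = [pair for record in split for pair in map_fn(record)]
        events.append(("map_done", i))

        # Shuffle: the reducer pulls this output right away,
        # even though later maps have not run yet.
        fetched.extend(output)
        events.append(("shuffle", i))

    # Only now, with every map finished, sort and group by key.
    grouped = defaultdict(list)
    for key, value in sorted(fetched):
        grouped[key].append(value)

    # The reduce function is applied last, once per key.
    results = {}
    for key, values in grouped.items():
        results[key] = reduce_fn(key, values)
        events.append(("reduce", key))
    return results, events


def word_count_map(line):
    # Hypothetical map function: emit (word, 1) for each word in the line.
    return [(word, 1) for word in line.split()]

def word_count_reduce(key, values):
    # Hypothetical reduce function: sum the counts for one word.
    return sum(values)

splits = [["a b a"], ["b c"], ["a c c"]]
results, events = run_job(splits, word_count_map, word_count_reduce)
print(results)
print(events)
```

In the event log, the shuffle of map 0 appears before map 2 has finished,
but no reduce entry appears until all three maps are done.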
Amar
On Mon, 3 Mar 2008, momina khan wrote:

> hi all,
>
> as seen in the video lectures from google, their map reduce ensures
> that all maps finish before reduces begin ... their reason for
> ensuring this is that reduce functions are not necessarily
> idempotent....
> i just wanted to confirm whether hadoop follows the same
> philosophy too? do all maps end before reduces begin, or can they run
> in parallel? because that is the impression you get from the hadoop code!
>
> cheers
> momina
>
