hadoop-common-dev mailing list archives

From "momina khan" <momina.a...@gmail.com>
Subject thanks :) Re: Shuffle phase
Date Mon, 03 Mar 2008 10:42:05 GMT
thanks :)

On Mon, Mar 3, 2008 at 5:23 AM, Owen O'Malley <oom@yahoo-inc.com> wrote:
>
>
>  On Mar 2, 2008, at 12:53 PM, momina khan wrote:
>
>  > I have trouble comprehending what the shuffle phase is exactly ... can
>  > anyone please explain it for me, and also point out the name of the
>  > class that the shuffle thread runs, and the class that spawns the
>  > thread itself?
>
>  The shuffle phase is the data motion from the map output to the
>  reduce input. In general, it involves each reduce collecting outputs
>  from each map, which is why it is called the "shuffle". The
>  TaskTracker where the map ran has a Jetty server that gives out the
>  map outputs. The ReduceTask copies the map outputs as the maps finish.
>  You can look at ReduceTask.java for the client side of the shuffle.
>
>  -- Owen
>
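
The data motion described above can be sketched as a toy, in-memory Java model. This is not Hadoop's code and does not reproduce ReduceTask.java: the class ShuffleSketch, its method names, and the hash-based partitioning rule are all assumptions made for illustration, and the HTTP transfer from the TaskTracker is replaced by plain method calls. It only shows the shape of the shuffle: each map splits its output into one bucket per reduce, and each reduce collects its bucket from every map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy in-memory model of the shuffle. Hypothetical names throughout;
// this is an illustration, not Hadoop's ReduceTask implementation.
class ShuffleSketch {

    // Assumed partitioning rule: non-negative hash of the key modulo the
    // number of reduces decides which reduce receives the key.
    static int partitionFor(String key, int numReduces) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduces;
    }

    // Map side: split one map's (key, value) records into one bucket per reduce.
    static List<List<String[]>> partitionMapOutput(List<String[]> records, int numReduces) {
        List<List<String[]>> buckets = new ArrayList<>();
        for (int r = 0; r < numReduces; r++) {
            buckets.add(new ArrayList<>());
        }
        for (String[] kv : records) {
            buckets.get(partitionFor(kv[0], numReduces)).add(kv);
        }
        return buckets;
    }

    // Reduce side: reduce r copies bucket r from every map and groups the
    // values by key. In a real cluster this copy would cross the network.
    static Map<String, List<String>> fetchForReduce(List<List<List<String[]>>> allMapOutputs, int r) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (List<List<String[]>> oneMapOutput : allMapOutputs) {
            for (String[] kv : oneMapOutput.get(r)) {
                grouped.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
            }
        }
        return grouped;
    }

    public static void main(String[] args) {
        int numReduces = 2;

        // Two maps, each emitting (word, "1") records.
        List<String[]> map0 = new ArrayList<>();
        map0.add(new String[] {"a", "1"});
        map0.add(new String[] {"b", "1"});
        List<String[]> map1 = new ArrayList<>();
        map1.add(new String[] {"a", "1"});

        List<List<List<String[]>>> allMapOutputs = new ArrayList<>();
        allMapOutputs.add(partitionMapOutput(map0, numReduces));
        allMapOutputs.add(partitionMapOutput(map1, numReduces));

        // Every reduce pulls its own partition from every map.
        for (int r = 0; r < numReduces; r++) {
            System.out.println("reduce " + r + " input: " + fetchForReduce(allMapOutputs, r));
        }
    }
}
```

Because every reduce reads from every map, all values for a given key end up at a single reduce no matter which map produced them.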
