hadoop-user mailing list archives

From Shahab Yunus <shahab.yu...@gmail.com>
Subject Re: Shuffle phase replication factor
Date Wed, 22 May 2013 14:37:50 GMT
As Bertrand mentioned, Hadoop: The Definitive Guide is, well... a really
definitive :) place to start. It is pretty thorough for a start, and once you
have gone through it, the code will start making more sense too.

Regards,
Shahab


On Wed, May 22, 2013 at 10:33 AM, John Lilley <john.lilley@redpoint.net> wrote:

> Oh I see.  Does this mean there is another service and TCP listen port
> for this purpose?
>
> Thanks for your indulgence… I would really like to read more about this
> without bothering the group, but I am not sure where to start to learn these
> internals other than the code.
>
> john
>
> *From:* Kai Voigt [mailto:k@123.org]
> *Sent:* Tuesday, May 21, 2013 12:59 PM
> *To:* user@hadoop.apache.org
> *Subject:* Re: Shuffle phase replication factor
>
> The map output doesn't get written to HDFS. The map task writes its output
> to its local disk, and the reduce tasks pull the data over HTTP for
> further processing.
>
> On 21.05.2013 at 19:57, John Lilley <john.lilley@redpoint.net> wrote:
>
> When MapReduce enters “shuffle” to partition the tuples, I am assuming
> that it writes intermediate data to HDFS.  What replication factor is used
> for those temporary files?
>
> john
>
> -- 
> Kai Voigt
> k@123.org
>
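
For readers following the thread, here is a minimal sketch of the flow Kai describes: each map task partitions its output by key and writes it to local disk (not HDFS, so no replication factor applies), and each reducer then collects its own partition from every map task. This is an illustration under stated assumptions, not Hadoop's implementation. All function names and the file layout are hypothetical, `zlib.crc32` stands in for the key's hash function, and a plain local file read stands in for the HTTP fetch that real reducers perform.

```python
import json
import os
import tempfile
import zlib
from collections import defaultdict


def partition_for(key: str, num_reducers: int) -> int:
    """Pick a reducer for a key: non-negative hash modulo the reducer count.

    zlib.crc32 is an assumption standing in for the key's hash function.
    """
    return (zlib.crc32(key.encode("utf-8")) & 0x7FFFFFFF) % num_reducers


def run_map_task(task_id, records, num_reducers, local_dir):
    """Write one map task's output to local disk, one file per partition."""
    buckets = defaultdict(list)
    for key, value in records:
        buckets[partition_for(key, num_reducers)].append((key, value))
    for part in range(num_reducers):
        # Hypothetical file naming; an empty partition still gets a file.
        path = os.path.join(local_dir, f"map{task_id}_part{part}.json")
        with open(path, "w") as f:
            json.dump(sorted(buckets[part]), f)


def fetch_partition(num_maps, part, local_dir):
    """Reducer side: gather this reducer's partition from every map task.

    A real reducer pulls these over HTTP; a local read stands in here.
    """
    merged = []
    for task_id in range(num_maps):
        path = os.path.join(local_dir, f"map{task_id}_part{part}.json")
        with open(path) as f:
            merged.extend(tuple(kv) for kv in json.load(f))
    return sorted(merged)


def word_count(splits, num_reducers=2):
    """Run the map, shuffle, and reduce steps over in-memory input splits."""
    # The intermediate files live in a temporary local directory and are
    # deleted when the job finishes.
    with tempfile.TemporaryDirectory() as local_dir:
        for task_id, words in enumerate(splits):
            run_map_task(task_id, [(w, 1) for w in words], num_reducers, local_dir)
        totals = {}
        for part in range(num_reducers):
            for key, value in fetch_partition(len(splits), part, local_dir):
                totals[key] = totals.get(key, 0) + value
        return totals
```

Calling `word_count([["a", "b", "a"], ["b", "c"]])` sends every occurrence of a given word to the same reducer, because the partition depends only on the key and the reducer count.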
