hadoop-common-user mailing list archives

From ch huang <justlo...@gmail.com>
Subject Re: issue about Shuffled Maps in MR job summary
Date Wed, 11 Dec 2013 06:20:24 GMT
I read the docs and found that with 8 reducers, each map task outputs 8
partitions, and each partition is sent to a different reducer. So if I
increase the number of reducers, the number of partitions increases, but the
volume of network traffic stays the same. Why, then, does increasing the
number of reducers sometimes not decrease the job completion time?
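
To make the point above concrete, here is a rough sketch in Python (not Hadoop code). It assumes a simple hash-style partitioner, whereas terasort actually uses a total-order partitioner, and the names `partition` and `simulate_shuffle` are made up for illustration. It only shows that the number of records moved stays the same while the number of map-output fetches grows with the number of reducers.

```python
import zlib


def partition(key, num_reducers):
    # Hypothetical stand-in for a partitioner: deterministic hash modulo reducers.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def simulate_shuffle(num_maps, num_reducers, records_per_map=1000):
    fetches = 0        # map outputs copied by reducers ("Shuffled Maps")
    records_moved = 0  # total records crossing the network
    for m in range(num_maps):
        buckets = [0] * num_reducers
        for i in range(records_per_map):
            buckets[partition(f"map{m}-rec{i}", num_reducers)] += 1
        # Assumption: every reducer fetches its partition from every map,
        # so each map contributes one fetch per reducer.
        fetches += num_reducers
        records_moved += sum(buckets)
    return fetches, records_moved


print(simulate_shuffle(20, 8))
print(simulate_shuffle(20, 16))
```

With more reducers each fetch is smaller, but there are more of them, so per-fetch and per-task overhead may offset the extra parallelism.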

On Wed, Dec 11, 2013 at 1:48 PM, Vinayakumar B <vinayakumar.b@huawei.com> wrote:

>  It looks simple :)
>
>
>
> Shuffled Maps= Number of Map Tasks * Number of Reducers
>
>
>
> Thanks and Regards,
>
> Vinayakumar B
>
>
>
> *From:* ch huang [mailto:justlooks@gmail.com]
> *Sent:* 11 December 2013 10:56
> *To:* user@hadoop.apache.org
> *Subject:* issue about Shuffled Maps in MR job summary
>
>
>
> Hi, mailing list:
>
>            I ran terasort with 16 reducers and with 8 reducers. When I
> doubled the number of reducers, the Shuffled Maps count also doubled. The
> job only runs 20 map tasks (the input is 10 files of 100 MB each and my
> block size is 64 MB, so there are 20 splits), so why do I need to shuffle
> 160 maps in the 8-reducer run and 320 maps in the 16-reducer run? How is
> the Shuffled Maps number calculated?
>
>
>
> 16 reducer summary output:
>
>  Shuffled Maps =320
>
> 8 reducer summary output:
>
> Shuffled Maps =160
>
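
For reference, the numbers in this thread can be reproduced with a short Python sketch. It assumes each file is split independently into block-sized pieces (a simplification that ignores the small slack Hadoop allows on the last split), and it uses the formula quoted above. The function names `num_splits` and `shuffled_maps` are illustrative only.

```python
import math


def num_splits(file_sizes_mb, block_size_mb):
    # Simplified model: one split per block-sized chunk of each file.
    return sum(math.ceil(size / block_size_mb) for size in file_sizes_mb)


def shuffled_maps(num_maps, num_reducers):
    # Shuffled Maps = Number of Map Tasks * Number of Reducers
    return num_maps * num_reducers


maps = num_splits([100] * 10, 64)  # 10 files of 100 MB, 64 MB blocks
print(maps, shuffled_maps(maps, 8), shuffled_maps(maps, 16))  # 20 160 320
```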
