hadoop-common-user mailing list archives

From java8964 <java8...@hotmail.com>
Subject RE: issue about Shuffled Maps in MR job summary
Date Wed, 11 Dec 2013 13:50:33 GMT
The total job completion time depends on a lot of factors. Are you sure the reduce phase is
the bottleneck?
It also depends on how many reducer input groups your MR job has. If you only
have 20 reducer input groups, then even if you raise your reducer count to 40, the elapsed time of the
reduce phase won't change much, because the additional 20 reduce tasks won't get any data to process.
If you have a lot of reducer input groups, your cluster has spare capacity at the moment,
and you also have a lot of idle reducer slots, then increasing your reducer count should decrease
your total job completion time.
Does that make sense?
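
To illustrate the point about reducer input groups, here is a minimal Python sketch. It assumes the job uses hash partitioning in the style of Hadoop's default HashPartitioner, (hashCode & Integer.MAX_VALUE) % numReduceTasks; the key names are hypothetical, and a job with a custom partitioner (terasort, for example) will distribute keys differently.

```python
def java_string_hash(s: str) -> int:
    """Mimic Java's String.hashCode() as a signed 32-bit integer (ASCII keys only)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as signed, as Java would.
    return h - 0x100000000 if h >= 0x80000000 else h


def partition(key: str, num_reducers: int) -> int:
    """Pick a reducer index the way a hash partitioner would."""
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reducers


def busy_reducers(keys, num_reducers):
    """Return the set of reducer indexes that receive at least one key."""
    return {partition(k, num_reducers) for k in keys}


# Hypothetical job with only 20 distinct reducer input groups.
keys = [f"group-{i:02d}" for i in range(20)]

for n in (20, 40):
    busy = busy_reducers(keys, n)
    print(f"{n} reducers: {len(busy)} receive data, {n - len(busy)} stay idle")
```

With 20 distinct groups, at most 20 reducers can ever receive data, so going from 20 to 40 reducers leaves at least 20 of them with nothing to do.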

Date: Wed, 11 Dec 2013 14:20:24 +0800
Subject: Re: issue about Shuffled Maps in MR job summary
From: justlooks@gmail.com
To: user@hadoop.apache.org

I read the docs and found that if I have 8 reducers, each map task outputs 8 partitions, and each
partition is sent to a different reducer. So if I increase the number of reducers, the number of
partitions increases, but the volume of network traffic stays the same. Why does increasing the
number of reducers sometimes not decrease the job completion time?

On Wed, Dec 11, 2013 at 1:48 PM, Vinayakumar B <vinayakumar.b@huawei.com> wrote:

It looks simple :)

Shuffled Maps= Number of Map Tasks * Number of Reducers
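
As a quick check of the formula above against the numbers reported in this thread (20 map tasks, 8 and 16 reducers), here is a small Python sketch. The function name is only illustrative; the reasoning is that each reducer fetches one partition from every map task's output, and the counter adds up those fetches across all reducers.

```python
def shuffled_maps(num_map_tasks: int, num_reducers: int) -> int:
    """Each reducer fetches one partition from every map task's output."""
    return num_map_tasks * num_reducers


for reducers in (8, 16):
    print(f"{reducers} reducers: Shuffled Maps = {shuffled_maps(20, reducers)}")
```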

Thanks and Regards,
Vinayakumar B

From: ch huang [mailto:justlooks@gmail.com] 

Sent: 11 December 2013 10:56
To: user@hadoop.apache.org
Subject: issue about Shuffled Maps in MR job summary



I ran terasort with 8 reducers and with 16 reducers. When I doubled the number of reducers, the
Shuffled Maps counter also doubled. My question: the job only runs 20 map tasks (the input is 10 files
of 100M each, my block size is 64M, so there are 20 splits), so why are 160 maps shuffled
in the 8-reducer run and 320 maps in the 16-reducer run? How is the Shuffled Maps number calculated?
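
The split count quoted in this message can be reproduced with a rough Python sketch. It assumes the split size equals the block size and that each file is split independently; the function name is hypothetical, and the real input format applies extra split-size settings that this simplification ignores.

```python
import math


def estimate_splits(file_sizes_mb, split_size_mb):
    """Rough estimate: every file is cut into ceil(size / split_size) pieces."""
    return sum(math.ceil(size / split_size_mb) for size in file_sizes_mb)


# 10 input files of 100M each with a 64M block size, as described above.
num_splits = estimate_splits([100] * 10, 64)
print(f"estimated splits (map tasks): {num_splits}")
```

Each 100M file spans two 64M blocks, so 10 files give 20 splits and therefore 20 map tasks.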


16 reducer summary output:

Shuffled Maps =320

8 reducer summary output:

Shuffled Maps =160