hadoop-mapreduce-user mailing list archives

From Pratap M <mc.pra...@gmail.com>
Subject Partitioner Can any one please clarify my question?
Date Fri, 14 Mar 2014 09:34:54 GMT
Hi,

I understand that each mapper produces one partition per reducer. How does a
reducer know which partitions to copy? Let's say there are 2 nodes running
mappers for a word count program and there are 2 reducers configured. If each
map node produces 2 partitions, and partitions on both nodes can contain the
same word as a key, how will the reducers work correctly?

For example:

Suppose node 1 produces partition 1 and partition 2, and partition 1 contains
the key "WHO".

Suppose node 2 produces partition 3 and partition 4, and partition 3 contains
the key "WHO".

If partition 1 and partition 4 went to reducer 1 (and the remaining partitions
to reducer 2), how would reducer 1 compute the correct word count?

If this is not possible, and partitions 1 and 3 are made to go to reducer 1,
how does Hadoop do this? Does it make sure that a given key from different
nodes always goes to the same reducer? If so, how does it do this?
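
To make the scenario concrete, here is a minimal Java sketch of hash partitioning on two map nodes. It assumes the formula used by Hadoop's default HashPartitioner, (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks. The class name PartitionSketch, the word lists, and the use of String.hashCode() in place of a Hadoop Text key are illustrative assumptions, not Hadoop code.

```java
public class PartitionSketch {
    // Assumed to mirror the default HashPartitioner formula: clear the sign
    // bit of the key's hash, then take the remainder by the reducer count.
    // The result depends only on the key and the reducer count, not on the node.
    public static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 2;

        // Hypothetical map outputs on two different nodes.
        String[] node1Words = {"WHO", "is", "there"};
        String[] node2Words = {"WHO", "goes", "there"};

        for (String word : node1Words) {
            System.out.println("node 1: " + word + " -> partition "
                    + partitionFor(word, numReduceTasks));
        }
        for (String word : node2Words) {
            System.out.println("node 2: " + word + " -> partition "
                    + partitionFor(word, numReduceTasks));
        }
    }
}
```

Under this assumption the partition number is the reducer index, so each node would number its partitions 0 and 1 rather than 1-2 and 3-4, and "WHO" would land in the same-numbered partition on both nodes.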
