hadoop-mapreduce-user mailing list archives

From Vamshi Krishna <vamshi2...@gmail.com>
Subject basic doubt on number of reduce tasks
Date Fri, 02 Mar 2012 10:09:55 GMT
Hi all,
Consider a Hadoop cluster with 4 nodes, where the maximum number of
reduce slots on every node is fixed at 5. When the MapReduce daemons are
started:

1) Is there any restriction on the number of simultaneously running reduce
tasks across nodes, for example that it must be the same on every node? OR

2) Is it like this: a node that holds a lot of data to be processed runs
more reduce tasks than a node that holds less data? That is, is the number
of reduce tasks run on each node proportional to the amount of data to be
processed on that node?

Could somebody please clarify this basic doubt? Which of the two is
correct? If neither, what is the actual process that takes place?

Vamshi Krishna
