hadoop-common-dev mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: Map Reduce in heterogeneous environ..
Date Thu, 11 Mar 2010 12:25:22 GMT
abhishek sharma wrote:
>> No. of slots per task tracker cannot be varied so even if some nodes
>> have additional cores, extra slots cannot be added.
> True. This is what I have been wishing for;-) I routinely use clusters
> where some machines have 8 while others have 4 cores.

Varying the number of task slots per node is trivial: every TT reports its 
own number of available slots. You just need a separate config file for 
every class of node in your cluster; set the 
mapred.tasktracker.map.tasks.maximum and 
mapred.tasktracker.reduce.tasks.maximum values to the limits for those 
machines, then push the right config file out to the right target machines.

If you don't have a way of providing different configurations to 
different machines in your cluster, the problem lies with your 
configuration management tooling/policy, not Hadoop.
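Even without proper CM tooling, the per-class push is a few lines of script. A minimal sketch, assuming hypothetical hostnames, config directories, and target path (this dry-run version only prints the copy commands it would run):

```python
# Map each per-class config directory to the hosts in that class.
# All names and paths here are hypothetical examples.
NODE_CLASSES = {
    "conf-8core": ["node8core-1", "node8core-2"],
    "conf-4core": ["node4core-1", "node4core-2"],
}

def push_commands():
    """Yield one scp command per host, copying that class's config."""
    for conf_dir, hosts in NODE_CLASSES.items():
        for host in hosts:
            yield f"scp {conf_dir}/mapred-site.xml {host}:/etc/hadoop/conf/"

if __name__ == "__main__":
    for cmd in push_commands():
        print(cmd)  # dry run: print instead of executing
```

A real deployment would execute the commands (or, better, let puppet/cfengine-style tooling own the files) rather than print them.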

What we don't have (today) is the ability of a live TT to vary its slots 
based on other system information: if the machine is also accepting 
workloads from some grid scheduler, the TT can't look at the number of 
live grid jobs or the IO load and use that to reduce its slot count. 
Contributions there would be welcomed by those people who share compute 
nodes across different workloads.
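To make the missing piece concrete: the kind of policy such a contribution might implement is just a function from system load to an advertised slot count. A sketch, illustrative only (Hadoop's TaskTracker has no such hook, and the thresholds here are invented):

```python
# Illustrative policy only -- no such TaskTracker hook exists today.
def dynamic_slot_count(total_cores, grid_jobs, io_load, io_threshold=0.8):
    """Shrink advertised task slots as competing grid jobs and IO load rise.

    total_cores: cores on this node
    grid_jobs:   live jobs placed here by an external grid scheduler
    io_load:     IO utilisation in [0, 1]; above io_threshold, halve slots
    """
    slots = total_cores - grid_jobs   # cede one core per live grid job
    if io_load > io_threshold:        # IO-bound box: back off harder
        slots //= 2
    return max(1, slots)              # always advertise at least one slot
```

The interesting engineering is not this arithmetic but wiring it in: the TT would have to re-report its slot count to the JobTracker as load changes, rather than reading a fixed maximum from config at startup.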

