hadoop-hdfs-user mailing list archives

From "David Parks" <davidpark...@yahoo.com>
Subject RE: About configuring cluster setup
Date Wed, 15 May 2013 07:50:18 GMT
We have a box that's a bit overpowered for just running our namenode and
jobtracker on a 10-node cluster, and, like you, we also wanted to make use
of the storage and processor resources of that node.


What we did is use LXC containers to segregate the different processes. LXC
is a very lightweight pseudo-virtualization platform for Linux (near 0


The key benefit of LXC, in this case, is that we can use Linux cgroups
(standard, simple config in LXC) to specify that the container/VM running
the namenode/jobtracker should have 10x the CPU and IO resources of the
container that runs a tasktracker/datanode (though since LXC containers all
run under the same kernel, any "unused" resources are assigned to runnable


We run Cloudera Hadoop and deployed a slightly modified tasktracker
configuration on the shared box (fewer task slots so as to not over-utilize
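
For anyone wanting to try the reduced-slot setup, here is a rough sketch of
generating the override for the shared box. It assumes an MRv1 tasktracker,
where the slot counts are set by mapred.tasktracker.map.tasks.maximum and
mapred.tasktracker.reduce.tasks.maximum in mapred-site.xml. The helper name
slot_overrides and the values 2 and 1 are made up for illustration; they are
not the numbers used on the cluster described here.

```python
import xml.etree.ElementTree as ET


def slot_overrides(map_slots, reduce_slots):
    """Build a mapred-site.xml fragment that caps MRv1 task slots.

    Hypothetical helper for illustration; slot counts are assumptions.
    """
    conf = ET.Element("configuration")
    settings = (
        ("mapred.tasktracker.map.tasks.maximum", map_slots),
        ("mapred.tasktracker.reduce.tasks.maximum", reduce_slots),
    )
    for name, value in settings:
        prop = ET.SubElement(conf, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(conf, encoding="unicode")


# Illustrative values only: fewer slots than a dedicated worker would get.
xml_text = slot_overrides(map_slots=2, reduce_slots=1)
print(xml_text)
```

The printed fragment would go into the mapred-site.xml of the shared box
only, leaving the dedicated workers on their usual slot counts.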


That tasktracker doesn't do as much work as the other dedicated nodes, but
it does a fair share, and the cgroup configurations (cpu.shares &
blkio.weight for the curious) ensure that the bulk processing doesn't
interfere with the critical namenode & jobtracker systems.
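
As a rough illustration of the two cgroup settings mentioned above, the
sketch below renders LXC config lines for a "master" and a "worker"
container. It assumes the LXC 1.x-style lxc.cgroup.<subsystem>.<key> syntax
and cgroup v1 semantics (cpu.shares defaults to 1024, blkio.weight ranges
from 10 to 1000). The helper name lxc_cgroup_lines and the specific numbers
are assumptions chosen to give a 10:1 ratio, not our actual values.

```python
# Sketch only: the weights below are illustrative assumptions.

CPU_SHARES_DEFAULT = 1024                      # cgroup v1 default for cpu.shares
BLKIO_WEIGHT_MIN, BLKIO_WEIGHT_MAX = 10, 1000  # valid range for blkio.weight


def lxc_cgroup_lines(cpu_shares, blkio_weight):
    """Return LXC 1.x-style config lines for cpu.shares and blkio.weight."""
    if cpu_shares < 2:
        raise ValueError("cpu.shares must be at least 2")
    if not BLKIO_WEIGHT_MIN <= blkio_weight <= BLKIO_WEIGHT_MAX:
        raise ValueError("blkio.weight must be between 10 and 1000")
    return [
        "lxc.cgroup.cpu.shares = %d" % cpu_shares,
        "lxc.cgroup.blkio.weight = %d" % blkio_weight,
    ]


# Namenode/jobtracker container: 10x the weights of the worker container.
master = lxc_cgroup_lines(cpu_shares=10 * CPU_SHARES_DEFAULT, blkio_weight=1000)
# Tasktracker/datanode container: default CPU weight, low IO weight.
worker = lxc_cgroup_lines(cpu_shares=CPU_SHARES_DEFAULT, blkio_weight=100)

for name, lines in (("master", master), ("worker", worker)):
    print("# " + name)
    print("\n".join(lines))
```

Both settings are relative weights rather than hard caps, so they only
matter when the containers are competing for CPU or disk at the same time.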



From: Robert Dyer [mailto:psybers@gmail.com] 
Sent: Tuesday, May 14, 2013 11:23 PM
To: user@hadoop.apache.org
Subject: Re: About configuring cluster setup


You can; however, note that unless you also run a TaskTracker on that node
(bad idea), any blocks that are replicated to this node won't be
available as local input to MapReduce jobs, so you are lowering the odds of
having data locality on those blocks.


On Tue, May 14, 2013 at 2:01 AM, Ramya S <ramyas@suntecgroup.com> wrote:



Can we configure 1 node as both Name node and Data node ?
