hadoop-common-user mailing list archives

From Brian Bockelman <bbock...@cse.unl.edu>
Subject Re: Best practices - Large Hadoop Cluster
Date Tue, 10 Aug 2010 18:35:34 GMT
I think Raj's question was about best practices, not how to do it.  Best practice is definitely
*not* to manage configurations one by one.  This is not a Hadoop question, it's a "how do
I manage a lot of computers" question.

Best practice is some combination of:
1) An automated way of installing the OS and software on all nodes according to a given
profile (Cobbler, ROCKS, and Perceus are options for RHEL variants), then running Hadoop
as a system service.  This guarantees you are able to replicate the configuration of a system.
2) Using Puppet or Cfengine to enforce system configuration properties (such as which daemons
are running).  This is sometimes more useful for rapidly changing environments.
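
As an illustration of option 2, a minimal Puppet manifest could enforce that a datanode
daemon is installed and kept running.  This is a sketch only; the `hadoop-datanode`
package and service names are assumptions that depend on how Hadoop is packaged for your
nodes:

```puppet
# Hypothetical manifest: resource names depend on your Hadoop packaging.
class hadoop::datanode {
  package { 'hadoop-datanode':
    ensure => installed,
  }

  service { 'hadoop-datanode':
    ensure  => running,   # restart the daemon if it dies or is stopped by hand
    enable  => true,      # start at boot, like any other system service
    require => Package['hadoop-datanode'],
  }
}
```

Puppet then converges every node toward this state on each run, so a manually killed
daemon or a hand-edited node drifts back to the documented configuration.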

While it is possible to manage things using passwordless ssh-based commands (in fact, ROCKS
will automatically do a nice passwordless ssh setup for you), this is often a few steps from
disaster.  It's all too easy to make undocumented changes, and you don't know how many
undocumented changes there are until your sysadmin leaves.

So: the best practice for running service X at large scale is not ssh.  It is to use the
service-management techniques provided by your operating system, or some other service-management
tool accepted by your organization (for example, SmartFrog from HP Labs goes above and beyond
Linux's somewhat antiquated init system).  This statement does not change if X = "Hadoop".


On Aug 10, 2010, at 1:13 PM, Gokulakannan M wrote:

> Hi Raj,
> 	As I understand it, the problem is the ssh password prompt each time
> you start or stop the cluster; you want passwordless startup and shutdown,
> right?
> 	Here is my way of overcoming the ssh problem.
> 	Write a shell script as follows:
> 	1. Generate an ssh key pair on the namenode machine (where you will
> start/stop the cluster).
> 	2. For each entry in the conf/slaves file, do the following:
> 		2.1 Append the public key you generated in step 1 to the
> ~/.ssh/authorized_keys file on that datanode machine, with something like:
> 			cat $HOME/.ssh/public_key_file | ssh username@host \
> 				'cat >> $HOME/.ssh/authorized_keys'
> 	3. Repeat step 2 for conf/masters as well.
> 	Note: The password must be entered for each username@host the first
> time, since the ssh command in step 2.1 requires it.
> 	Now you can start/stop your Hadoop cluster without the ssh password
> overhead.
> Thanks,
>  Gokul
> ***************************************************************************************
> -----Original Message-----
> From: Raj V [mailto:rajvish@yahoo.com] 
> Sent: Tuesday, August 10, 2010 7:16 PM
> To: common-user@hadoop.apache.org
> Subject: Best practices - Large Hadoop Cluster
> I need to start setting up a large - hadoop cluster of 512 nodes . My
> biggest 
> problem is the SSH keys. Is there a simpler way of generating and exchanging
> ssh 
> keys among the nodes? Any best practices? If there is none, I could
> volunteer to 
> do it,
> Raj
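
The key-distribution steps in Gokul's reply above can be collected into a single script.
This is a hedged sketch, not code from the thread: `ssh-copy-id` stands in for the
hand-rolled `cat | ssh` pipeline, and `HADOOP_CONF_DIR`, `REMOTE_USER`, and the `id_rsa`
key path are assumed names you would adjust to your site.

```shell
#!/bin/sh
# Sketch of steps 1-3 from the post. HADOOP_CONF_DIR and REMOTE_USER
# are illustrative defaults, not part of the original instructions.
HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-conf}
REMOTE_USER=${REMOTE_USER:-hadoop}

# Step 1: generate a key pair on the namenode if one does not exist yet.
[ -f "$HOME/.ssh/id_rsa.pub" ] || ssh-keygen -t rsa -N '' -f "$HOME/.ssh/id_rsa"

# Steps 2 and 3: merge slaves and masters, dropping comments and duplicates.
hosts=$(cat "$HADOOP_CONF_DIR/slaves" "$HADOOP_CONF_DIR/masters" 2>/dev/null \
        | grep -v '^#' | sort -u)

# Step 2.1: ssh-copy-id appends the public key to the remote
# authorized_keys file; it prompts for the password once per host,
# exactly as the note in the post describes.
for h in $hosts; do
    ssh-copy-id "$REMOTE_USER@$h"
done
```

`ssh-copy-id` ships with OpenSSH on most Linux distributions and does the same append as
the `cat >> authorized_keys` one-liner, while also taking care of remote file permissions.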
