hadoop-common-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject RE: Best practices - Large Hadoop Cluster
Date Tue, 10 Aug 2010 23:55:28 GMT

Raj...

Ok, one thing we have at one of my clients is that the hadoop user's account is actually
a centralized account. (Users' accounts are mounted as they log in to the machine.)
So you have a single hadoop account across all of the machines.

So when you set up the keys, they are in the ~hadoop account.
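
For what it's worth, with a centrally mounted home directory the key setup
itself is short. A rough sketch, assuming the stock OpenSSH tools and the
default paths:

    # one key pair, no passphrase, so the start/stop scripts don't prompt
    mkdir -p ~hadoop/.ssh
    ssh-keygen -t rsa -P "" -f ~hadoop/.ssh/id_rsa

    # since ~hadoop is mounted everywhere, authorizing the key once
    # authorizes it on every machine that mounts the same home
    cat ~hadoop/.ssh/id_rsa.pub >> ~hadoop/.ssh/authorized_keys
    chmod 700 ~hadoop/.ssh
    chmod 600 ~hadoop/.ssh/authorized_keys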

So you have a bit of work with 512 nodes, and yeah, it's painful the first time.

Like I said, I don't have a cloud of 512 nodes, and when I'm building a cloud of 20+ machines,
setting up ssh is just part of the process.

If you set up hadoop as a system service, does that mean that when you boot the machine, your
node comes up on its own like other services?
I personally don't think that's a good idea...

I haven't evaluated Puppet; I'm pulled yet again into other things...

So I don't have an answer.

My point was that you can go through and add the user and ssh keys as part of the build process,
and while painful, it's not that painful. (Trust me, there are worse things that can get dropped
on your desk. ;-)

-Mike


> Date: Tue, 10 Aug 2010 13:06:51 -0700
> From: rajvish@yahoo.com
> Subject: Re: Best practices - Large Hadoop Cluster
> To: common-user@hadoop.apache.org
> 
> Mike
> 512 nodes: even a minute for each node (ssh-ing to each node, typing an
> 8-character password, making sure everything looks ok) is about 8.5 hours.
> After that, if something does not work, that is a different level of pain
> altogether.
> 
> Using scp to exchange keys simply does not scale.
> 
> My question was simple: how do other people in the group who run large
> clusters manage this? Brian put it better: what is the best, repeatable way
> of running hadoop when the cluster is large? I agree this is not a hadoop
> question per se, but hadoop is really what I care about now.
> 
> Thanks to others for useful suggestions. I will examine them and post a summary 
> if anyone is interested.
> 
> Raj
> 
> 
> 
> 
> 
> ________________________________
> From: Michael Segel <michael_segel@hotmail.com>
> To: common-user@hadoop.apache.org
> Sent: Tue, August 10, 2010 11:36:14 AM
> Subject: RE: Best practices - Large Hadoop Cluster
> 
> 
> I'm a little confused by Raj's problem.
> 
> If you follow the instructions outlined in the Hadoop books and everywhere else 
> about setting up ssh keys, you shouldn't have a problem.
> I'd just ssh as the hadoop user to each of the nodes before trying to start 
> hadoop for the first time.
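> 
> That first ssh is mostly about getting each host into known_hosts. If you
> want to script it, something like this (a sketch; assumes the keys are
> already in place and conf/slaves lists one hostname per line):
> 
>     for host in $(cat conf/slaves); do
>         # accept the host key up front so the start scripts don't stall
>         # on the "are you sure you want to continue connecting?" prompt
>         ssh -o StrictHostKeyChecking=no hadoop@$host true
>     done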
> 
> At 512 nodes, I think you may run into other issues... (I don't know; I don't
> have 512 machines to play with :-( ) And Puppet has been recommended a couple
> of times.
> 
> Just my $0.02
> 
> -Mike
> 
> 
> > Date: Tue, 10 Aug 2010 23:43:12 +0530
> > From: gokulm@huawei.com
> > Subject: RE: Best practices - Large Hadoop Cluster
> > To: common-user@hadoop.apache.org
> > 
> > 
> > Hi Raj,
> > 
> >     As per my understanding, the problem is the ssh password each time
> > you start/stop the cluster. You need passwordless start/stop, right?
> > 
> >     Here is my way of overcoming the ssh problem.
> > 
> >     Write a shell script as follows (a sketch of the whole script is
> >     below):
> > 
> >     1. Generate an ssh key on the namenode machine (where you will
> >        start/stop the cluster).
> > 
> >     2. Read each entry from the conf/slaves file and do the following:
> > 
> >        2.1 Append the key you generated in step 1 to the ssh
> >            authorized_keys file on the datanode machine you read in
> >            step 2, with something like:
> > 
> >            cat $HOME/.ssh/public_key_file | ssh username@host 'cat >> $HOME/.ssh/authorized_keys'
> > 
> >     3. Repeat step 2 for conf/masters also.
> > 
> >     Note: The password must be entered for each username@host the first
> >     time, since the ssh command in step 2.1 requires it.
> > 
> >     Now you can start/stop your hadoop cluster without the ssh password
> >     overhead.
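> > 
> >     Putting it together, a minimal sketch (username and public_key_file
> >     are placeholders from the steps above; assumes one hostname per line
> >     in the conf files):
> > 
> >     #!/bin/sh
> >     # push the namenode's public key to every slave and master;
> >     # expect one password prompt per host on this first run
> >     KEY=$HOME/.ssh/public_key_file
> >     for host in $(cat conf/slaves conf/masters | sort -u); do
> >         cat $KEY | ssh username@$host \
> >             'mkdir -p $HOME/.ssh && cat >> $HOME/.ssh/authorized_keys && chmod 600 $HOME/.ssh/authorized_keys'
> >     done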
> > 
> > 
> >  Thanks,
> >   Gokul
> >  
> >    
> >  
> > 
> > -----Original Message-----
> > From: Raj V [mailto:rajvish@yahoo.com] 
> > Sent: Tuesday, August 10, 2010 7:16 PM
> > To: common-user@hadoop.apache.org
> > Subject: Best practices - Large Hadoop Cluster
> > 
> > I need to start setting up a large Hadoop cluster of 512 nodes. My biggest
> > problem is the SSH keys. Is there a simpler way of generating and
> > exchanging ssh keys among the nodes? Any best practices? If there is none,
> > I could volunteer to do it.
> > 
> > Raj