hadoop-common-user mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: Why does Hadoop need ssh access to master and slaves?
Date Wed, 21 Jan 2009 12:58:53 GMT
Amit k. Saha wrote:
> On Wed, Jan 21, 2009 at 5:53 PM, Matthias Scherer
> <Matthias.Scherer@1und1.de> wrote:
>> Hi all,
>> we've made our first steps in evaluating hadoop. The setup of 2 VMs as a
>> hadoop grid was very easy and works fine.
>> Now our operations team wonders why hadoop has to be able to connect to
>> the master and slaves via password-less ssh?! Can anyone give us an
>> answer to this question?
> 1. There has to be a way to connect to the remote hosts (the slaves and a
> secondary master), and SSH is the secure way to do it
> 2. It has to be password-less to enable automatic logins

SSH is *a* secure way to do it, but not the only way. Other management 
tools can bring up hadoop clusters. Hadoop ships with scripted support 
for SSH as it is standard with Linux distros and generally the best way 
to bring up a remote console.
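
To make the "scripted support for SSH" concrete, here is a rough Python sketch of the idea: read a list of worker hosts and build one non-interactive ssh command per host. The real Hadoop scripts are shell scripts; the function names, the sample host names and the placeholder remote command below are all made up for illustration. `BatchMode=yes` is a standard OpenSSH option that stops ssh from prompting for a password, which is why key-based login has to be in place first.

```python
import shlex


def parse_hosts(slaves_text):
    """Return host names from slaves-file text, skipping blank lines and # comments."""
    hosts = []
    for line in slaves_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if line:
            hosts.append(line)
    return hosts


def build_ssh_commands(hosts, remote_command):
    """Build one ssh argv per host; BatchMode=yes makes ssh fail instead of prompting."""
    return [["ssh", "-o", "BatchMode=yes", host, remote_command] for host in hosts]


if __name__ == "__main__":
    # Hypothetical host list and placeholder command, not taken from a real cluster.
    sample = "node1.example.com\n# spare machine\nnode2.example.com\n"
    for argv in build_ssh_commands(parse_hosts(sample), "echo start-daemon-placeholder"):
        # Print rather than execute, so the sketch is safe to run anywhere.
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Any other tool that can start a process on each host could replace the ssh step here, which is the point above: SSH is a convenience of the bundled scripts, not a requirement of the running cluster.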

Your ops team should not need to worry about SSH security, as long as 
they keep their keys under control.

(a) Key-based SSH is more secure than passworded SSH, as man-in-the-middle 
attacks are prevented. Passphrase-protected SSH keys kept on external USB 
keys are even better.

(b) Once the cluster is up, that filesystem is pretty vulnerable to 
anything on the LAN. You do need to lock down your datacentre, or set up 
the firewall/routing of the servers so that only trusted hosts can talk 
to the FS. SSH becomes a detail at that point.
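
The "only trusted hosts" rule amounts to an allowlist of networks. The enforcement itself belongs in your firewall or routing setup, not in application code, but the logic can be sketched in a few lines of Python. The 10.0.0.0/24 subnet and the function name are assumptions chosen for the example.

```python
import ipaddress

# Assumed example: the cluster's private subnet. Substitute your own networks.
TRUSTED_NETWORKS = [ipaddress.ip_network("10.0.0.0/24")]


def is_trusted(client_ip, networks=TRUSTED_NETWORKS):
    """Return True if client_ip falls inside any of the trusted networks."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in networks)


if __name__ == "__main__":
    for ip in ("10.0.0.17", "192.168.1.5"):
        print(ip, "allowed" if is_trusted(ip) else "blocked")
```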
