hadoop-common-user mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: AW: Why does Hadoop need ssh access to master and slaves?
Date Wed, 21 Jan 2009 13:36:36 GMT
Matthias Scherer wrote:
> Hi Steve and Amit,
> Thanks for your answers. I agree with you that key-based ssh is nothing to worry about.
> But I'm wondering what exactly - that is, which grid administration tasks - hadoop does via
> ssh. Does it restart crashed data nodes or task trackers on the slaves? Or does it transfer
> data over the grid with ssh access? Where can I find a short description of what exactly hadoop
> needs ssh for? The documentation says only that I have to configure it.
> Thanks & Regards
> Matthias

SSH is used by the various scripts in bin/ to start and stop clusters. 
slaves.sh does the work; the other scripts (like hadoop-daemons.sh) call 
it to run commands on each machine.
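
The fan-out those scripts perform can be sketched roughly as follows. This is 
an illustration only, not the real slaves.sh (which is a bash script): it reads 
a list of hostnames, one per line, and builds one ssh invocation per host. The 
function names, the comment handling, and the example daemon command are 
assumptions made for the sketch.

```python
import shlex
import subprocess


def read_hosts(text):
    """Return hostnames from slaves-file text, skipping blanks and # comments."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            hosts.append(line)
    return hosts


def build_ssh_commands(hosts, remote_argv):
    """Build one ssh argv per host; remote_argv is quoted for the remote shell."""
    remote = " ".join(shlex.quote(arg) for arg in remote_argv)
    return [["ssh", host, remote] for host in hosts]


def run_on_hosts(hosts, remote_argv):
    """Start the command on every host in parallel; return exit codes by host."""
    commands = build_ssh_commands(hosts, remote_argv)
    procs = {host: subprocess.Popen(cmd) for host, cmd in zip(hosts, commands)}
    return {host: proc.wait() for host, proc in procs.items()}


if __name__ == "__main__":
    # Hypothetical slaves list and command, printed rather than executed.
    hosts = read_hosts("node1\nnode2  # second rack\n")
    for cmd in build_ssh_commands(hosts, ["bin/hadoop-daemon.sh", "start", "datanode"]):
        print(" ".join(cmd))
```

This is why passwordless, key-based ssh is needed: the loop runs without a 
human present to type a password for each machine.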

The EC2 scripts use SSH to talk to the machines brought up there. When 
you ask Amazon for machines, you give it a public key to be added to 
root's list of allowed keys; you then use that key to ssh in and run code.

There is currently no liveness checking or restarting built into the 
scripts; you need other tools for that. I am working on this with HADOOP-3628.

I will be showing some other management options at ApacheCon EU 2009. 
As it is on the same continent and in the same timezone as you, you may 
want to consider attending; lots of Hadoop people will be there, with 
some all-day sessions on it.

One big problem with cluster management is not just recognising failed 
nodes, it's handling them. The actions you take differ between a 
VM cluster like EC2 (fix: reboot, then kill that AMI and create a new 
one), a VMware/Xen-managed cluster, and physical systems (Y!: phone 
Allen, us: email paolo). Once we have the health monitoring in there, 
different people will need to apply their own policies.
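
To make "their own policies" concrete, here is a minimal sketch of how the 
response to a failed node might be kept separate from detecting it. None of 
this exists in Hadoop; the cluster kinds, the action strings, and the function 
name are hypothetical, and the managed-VM action in particular is a guess, 
since no fix is named for that case above.

```python
from enum import Enum


class ClusterKind(Enum):
    EC2 = "ec2"
    MANAGED_VM = "managed-vm"
    PHYSICAL = "physical"


# Hypothetical per-environment policies, paraphrasing the examples above.
POLICIES = {
    ClusterKind.EC2: ["reboot", "kill the instance and create a new one"],
    ClusterKind.MANAGED_VM: ["ask the VM manager to restart the guest"],  # assumed
    ClusterKind.PHYSICAL: ["notify a human operator"],
}


def actions_for(kind, node):
    """Return the ordered recovery steps for a failed node in this kind of cluster."""
    return [f"{node}: {step}" for step in POLICIES[kind]]


if __name__ == "__main__":
    for step in actions_for(ClusterKind.EC2, "node7"):
        print(step)
```

The point of the table is that the monitoring code can stay the same across 
sites while each site swaps in its own entries.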


Steve Loughran                  http://www.1060.org/blogxter/publish/5
