hama-user mailing list archives

From Minho Kim <minwise....@samsung.com>
Subject RE: General questions
Date Fri, 28 Aug 2015 04:47:26 GMT
Hi Behroz,

I would like to respond to your questions.

*1- Password-Less SSH between machines of cluster*
>> Scripts (Bash, Python, and so on) are useful for managing many servers.
So I recommend using script files for cluster management tasks such as installing Hama and
modifying its configuration. The script at the following link may help you write your own.

https://github.com/awslabs/emr-bootstrap-actions/tree/master/hama
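
For the SSH keys in particular, a minimal Python sketch of such a script could look like the
one below. It is only an illustration, not something taken from the linked repository: the
function names, host names, user name, and key path are all assumptions, and it relies on the
standard OpenSSH `ssh-copy-id` tool being installed on the machine that runs it.

```python
import subprocess


def build_ssh_copy_id_commands(hosts, user, public_key):
    """Return one ssh-copy-id argument list per host (nothing is executed here)."""
    return [["ssh-copy-id", "-i", public_key, f"{user}@{host}"] for host in hosts]


def distribute_key(hosts, user, public_key):
    """Run ssh-copy-id for every host and return the targets that failed.

    ssh-copy-id asks for the remote password once per host; after that,
    key-based login from this machine to that host should work.
    """
    failed = []
    for cmd in build_ssh_copy_id_commands(hosts, user, public_key):
        if subprocess.run(cmd).returncode != 0:
            failed.append(cmd[-1])  # last argument is "user@host"
    return failed


# Hypothetical usage, e.g. run once on the master node:
# workers = [f"worker{i}" for i in range(1, 21)]
# print(distribute_key(workers, "hadoop", "/home/hadoop/.ssh/id_rsa.pub"))
```

This only sets up logins from the machine where it runs (typically the master) to the listed
hosts. If every machine must reach every other one, the same loop would have to run on each
machine, or a shared key pair could be copied to all of them.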

*2- Dynamic IPs*
*3- HDFS data*
>> In fact, I have never used a cluster with dynamically allocated IPs. But the answer to both
questions is to use a DNS server. With a DNS server, you can write your configuration using
host names, each of which is an alias assigned to an IP.
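
Before starting the cluster after a restart, it can help to check that every host name used in
the configuration still resolves. The following is a minimal Python sketch of such a check.
The function name and the host names in the usage comment are assumptions for illustration,
and the `resolve` parameter exists only so that the lookup can be replaced when testing.

```python
import socket


def resolve_hosts(hostnames, resolve=socket.gethostbyname):
    """Map each host name to its current IP address, or None if the lookup fails."""
    result = {}
    for name in hostnames:
        try:
            result[name] = resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            result[name] = None
    return result


# Hypothetical usage:
# for name, ip in resolve_hosts(["master", "worker1", "worker2"]).items():
#     print(name, ip if ip else "DOES NOT RESOLVE")
```

Any host reported as unresolved would need to be fixed in DNS before the daemons are started.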


Best Regards,
Minho Kim

-----Original Message-----
From: Behroz Sikander [mailto:behroz89@gmail.com] 
Sent: Friday, August 28, 2015 11:43 AM
To: user@hama.apache.org
Subject: General questions

Hi,
Since I am pretty new to cluster configuration, I need some suggestions on how to solve
the problems below efficiently.

*1- Password-Less SSH between machines of cluster*
To have a working cluster, we need password-less SSH access between all the machines in the
cluster. Until now, I have been setting this up manually because I had only 3 machines. I am
moving to 20 machines now, so it is a lot of work. How do teams that manage hundreds of
servers solve this problem? Bash scripts?

*2- Dynamic IPs*
This is the biggest problem. Every time I restart my cluster, all the machines get new IPs.
This means that I need to modify the /etc/hosts files on all machines. I also need to verify
my password-less SSH logins. Getting a static IP is difficult in my current setup. So how
do people solve this problem?

*3- HDFS data*
As per my current understanding, HDFS has namenodes and datanodes.
Namenodes contain all the information about chunks and where they are placed. Now, let's
assume I restarted my cluster and got new IPs. My whole HDFS data will be messed up. Again,
how can I solve this problem?

*4- Monitoring*
Hama provides a Web GUI to check basic information about a job. But a few things seem
to be missing, like bandwidth, CPU, and memory usage at the cluster and individual machine
level. Are there any third-party tools that can be integrated into the cluster to monitor
Hama? (Ambari, maybe?)

Regards,
Behroz

