hadoop-common-dev mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Hadoop cluster/monitoring
Date Sun, 12 Aug 2012 14:18:42 GMT

On Wed, Aug 8, 2012 at 10:52 PM, Nagaraju Bingi
<nagaraju_bingi@persistent.co.in> wrote:
> Hi,
> I'm beginner in Hadoop concepts. I have few basic questions:
> 1) looking for APIs to retrieve the capacity of the cluster. so that i can write a script
> to when to add a new slave node to the cluster
>              a) No.of Task trackers and capacity of each task tracker to spawn max
> No.of Mappers

For this, see: http://hadoop.apache.org/common/docs/stable/api/org/apache/hadoop/mapred/ClusterStatus.html

>               b) CPU,RAM and disk capacity of each tracker

Rely on other tools to provide this one. Tools such as Ganglia and
Nagios can report this, for instance.

>               c) how to decide to add a new  slave node to the cluster

This depends heavily on the workload your cluster is expected to handle.

>  2) what is the API to retrieve metrics like current usage of resources and currently
> running/spawned Mappers/Reducers

See 1.a (ClusterStatus) for the currently running map and reduce task
counts, and 1.b (Ganglia, Nagios) for resource usage.
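
As a rough sketch of how the 1.a figures could feed a script: the class
below only does the arithmetic on slot counts. The class name, method
names, the 0.8 threshold and the sample numbers are all made up for
illustration. On a real cluster the four inputs would come from the
ClusterStatus object returned by JobClient.getClusterStatus(), via
getMapTasks(), getMaxMapTasks(), getReduceTasks() and getMaxReduceTasks().
Whether slot utilization is the right signal at all depends on your
workload (see 1.c).

```java
public class SlotUtilization {

    /** Fraction of slots in use; 0.0 when no slots are reported. */
    static double utilization(int runningTasks, int maxTasks) {
        if (maxTasks <= 0) {
            return 0.0;
        }
        return (double) runningTasks / maxTasks;
    }

    /**
     * Hypothetical rule: flag the cluster when either the map slots or
     * the reduce slots are used at or above the given threshold.
     */
    static boolean nearCapacity(int runningMaps, int maxMaps,
                                int runningReduces, int maxReduces,
                                double threshold) {
        return utilization(runningMaps, maxMaps) >= threshold
            || utilization(runningReduces, maxReduces) >= threshold;
    }

    public static void main(String[] args) {
        // Sample numbers only; replace with values read from ClusterStatus.
        int runningMaps = 45, maxMaps = 50;
        int runningReduces = 10, maxReduces = 25;
        double threshold = 0.8; // assumed value, tune for your workload

        System.out.println("map slot utilization:    "
            + utilization(runningMaps, maxMaps));
        System.out.println("reduce slot utilization: "
            + utilization(runningReduces, maxReduces));
        System.out.println("near capacity:           "
            + nearCapacity(runningMaps, maxMaps,
                           runningReduces, maxReduces, threshold));
    }
}
```

A single reading says little; sampling these counts over time and looking
at sustained utilization would be a more useful input for a decision.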

>  3) what is the purpose of Hadoop-common?Is it API to interact with hadoop

Hadoop Common encapsulates the utilities shared by the other two
sub-projects, MapReduce and HDFS. Among other things, it does provide
a general interaction API for all things 'Hadoop'.

Harsh J
