hadoop-common-issues mailing list archives

From "Eric Yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-7417) Hadoop Management System (Umbrella)
Date Sat, 25 Jun 2011 20:31:47 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054963#comment-13054963 ]

Eric Yang commented on HADOOP-7417:

Allen, I studied the existing systems extensively, including Puppet, MCollective, Chef,
CFEngine, ControlTier, and Bcfg2.  Most configuration management systems focus on generating
a set of templates and config parameters and pushing out changes one node at a time.  This works
fine for a small number of machines, but most of these systems fail beyond 1800 nodes or become
difficult to maintain.  For example, mcollective uses a spanning tree model on puppeteer, hence the
puppet master becomes a single point of failure.  One puppet master failure could take a large
chunk of the nodes offline.  HMS is designed to remove single points of failure in the deployment
system and to improve performance.  We found it more reliable to store system state in ZooKeeper
for HA.  Zeroconf is great for resolving service location.  Existing config management systems
require installation and configuration before they can be deployed.  HMS is designed to install
and operate without having to configure the management system.  BitTorrent is much faster
than installing software from a yum repository on large-scale systems.  Granted, this system
started several years behind the existing systems, but it solves some scalability and reliability
problems.

To summarize, HMS does the following better:

- Scale
- Reliability
- Cross node application orchestration (action dependencies)
- Speed
- Sophisticated monitoring system (reuses Chukwa)
- Self healing cluster (Ability to replay history to heal nodes)
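
To illustrate what "action dependencies" across nodes could mean, here is a minimal sketch
that orders actions so each one runs only after its prerequisites.  This is not HMS code:
the function name order_actions, the "node:action" naming, and the sample plan are all
hypothetical, and a plain topological sort is assumed as the ordering technique.

```python
from collections import deque


def order_actions(dependencies):
    """Return actions in an order that respects their dependencies.

    `dependencies` maps each action to the set of actions that must
    finish first. Raises ValueError if the dependencies form a cycle.
    """
    # Collect every action, including ones that appear only as prerequisites.
    actions = set(dependencies)
    for prereqs in dependencies.values():
        actions.update(prereqs)

    # remaining[a] = prerequisites of a that have not run yet.
    remaining = {a: set(dependencies.get(a, ())) for a in actions}
    # dependents[p] = actions that are waiting on p.
    dependents = {a: [] for a in actions}
    for action, prereqs in remaining.items():
        for p in prereqs:
            dependents[p].append(action)

    # Start with actions that have no prerequisites (sorted for stable output).
    ready = deque(sorted(a for a, p in remaining.items() if not p))
    ordered = []
    while ready:
        current = ready.popleft()
        ordered.append(current)
        for nxt in sorted(dependents[current]):
            remaining[nxt].discard(current)
            if not remaining[nxt]:
                ready.append(nxt)

    if len(ordered) != len(actions):
        raise ValueError("dependency cycle detected")
    return ordered


# Hypothetical cluster actions, written as "node:action".
plan = {
    "namenode:start": {"namenode:install"},
    "datanode1:start": {"datanode1:install", "namenode:start"},
}
print(order_actions(plan))
```

In this sketch the datanode is started only after its own install and the namenode start
have both completed, which is the kind of cross-node ordering the list item refers to.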

> Hadoop Management System (Umbrella)
> -----------------------------------
>                 Key: HADOOP-7417
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7417
>             Project: Hadoop Common
>          Issue Type: New Feature
>         Environment: Java 6, Linux
>            Reporter: Eric Yang
>            Assignee: Eric Yang
> The primary goal of Hadoop Management System is to build a component around management
> and deployment of Hadoop related projects. This includes software installation, configuration,
> application orchestration, deployment automation and monitoring Hadoop.
> Prototype demo source code can be obtained from:
> http://github.com/macroadster/hms
