incubator-cvs mailing list archives

From Apache Wiki <>
Subject [Incubator Wiki] Update of "HMSProposal" by EricYang
Date Thu, 18 Aug 2011 17:55:06 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Incubator Wiki" for change notification.

The "HMSProposal" page has been changed by EricYang:

  == Abstract ==
- HMS is a software deployment system for Hadoop Ecosystem.
+ HMS is a lifecycle management project for Apache Hadoop clusters.
  == Proposal ==
- HMS will develop an open source deployment system for large distributed systems.  The primary
goal of HMS is to build a community around management and deployment of Hadoop related projects.
 This includes software installation, configuration, application orchestration and deployment
+ We are proposing an Apache project to build a tool that would vastly simplify the process
of deploying and configuring the Hadoop stack on a cluster. The Hadoop stack comprises various
software components in the Hadoop ecosystem, e.g. HDFS, MapReduce, HBase, Hive, HCatalog, Pig,
Zookeeper and Oozie. Our plan is to support the Hadoop stack as a unit of deployment and configuration,
in which only certain pre-tested versions of the software components are supported as part of
the stack. Administrators can always enable or disable individual software components
in the Hadoop stack according to their deployment needs.
+ The main use cases that HMS is trying to address are the following:
+  * Hadoop stack deployment and upgrades 
+  * Hadoop services configuration & management
+   * Declarative configuration (no scripts required)
+  * Administration of Hadoop services
+   * Includes starting and stopping services
+   * System maintenance tasks, such as fsck, format, re-balance, and compaction
+  * User access & quota management on Hadoop clusters
+  * Easily check and be alerted to failures in Hadoop servers
+  * Automated discovery of new machines that become available
+  * Expanding and contracting Hadoop clusters
+  * Automatic resynchronization to ‘desired’ state to handle faulty nodes
+  * Handle node burn-ins
+  * In the future, possibly allow for customized monitoring dashboards
+  * Dynamic configuration - Hadoop configuration deduced from machine attributes (e.g., RAM,
CPU, Disk)
+  * Operational monitoring for Hadoop clusters
+ HMS is targeted at administrators responsible for managing Hadoop clusters. HMS leverages
existing data center management and monitoring infrastructure - Nagios, LDAP, Kerberos, etc.
+ For bare metal provisioning, cluster admins continue to use their existing infrastructure.
Provisioning a machine from scratch is not in the scope of the current roadmap.
  == Background ==
- HMS was developed by Yahoo Inc, headed by Jagane Sundar, in Sunnyvale in 2011.  HMS was designed
as a reference implementation for automating Hadoop cluster deployment.
+ Hadoop’s ecosystem includes many projects (HDFS, MapReduce, Pig, HBase, etc.). In many
cases, users and operators typically want to deploy a combination of some projects as a stack.
It takes a significant amount of time to get a properly configured Hadoop cluster up and running.
HMS has been designed to solve that problem. HMS automates the whole process of deploying
a stack.
+ HMS is being developed by developers employed by Yahoo!, Hortonworks and IBM. Such a tool
would have a large number of users and increase the adoption of Apache Hadoop’s ecosystem.
We are therefore proposing to make HMS an Apache open source project.
  == Rationale ==
+ The reasons for having a tool like HMS have been explained above. As an Apache
open source project, HMS would benefit from access to the large community
that currently uses Hadoop and the other products built around Hadoop (like Pig, Hive, etc.).
Users of the Hadoop ecosystem can influence HMS’s roadmap and contribute to it. Conversely,
we believe that having HMS as part of the Hadoop ecosystem will be a great
benefit to the current Hadoop ecosystem too.
- The maintainers and developers of HMS are interested in joining the Apache Software Foundation
top level project for several reasons:
-  * Apache provides a great community and environment for open source software development.
-  * It might open the door for sharing ideas or cooperation with other Apache projects, such
as Hadoop and HBase.
-  * HMS would like to benefit from Apache's development infrastructure.
- == Initial Goals ==
- Though the bulk of HMS initial development is complete and the framework is running stable,
there are still some large areas for future development. Some area we hope to focus on in
-  * Define interface for software upgrade
-  * Refine server failure recovery
  == Current Status ==
- HMS is currently a reference implementation for deploying Hadoop on large scale clusters.  The
majority of the team was employed at Yahoo.  One major contributor has moved to IBM.
- This change does not immediately affect HMS because the people who were in Yahoo still remain
active contributors to HMS.  The project continues to be supported and actively enhanced.
 There is now the opportunity to become an open source project without a single large organization
  === Meritocracy ===
- The initial developers are very familiar with meritocratic open source development, both
at Apache and elsewhere. Apache was chosen specifically because the initial developers want
to encourage this style of development for the project.
+ Our intent with this incubator proposal is to start building a diverse developer community
around HMS following the Apache meritocracy model. We have wanted to make the project open
source and encourage contributors from multiple organizations from the start. We plan to provide
plenty of support to new developers and to quickly recruit those who make solid contributions
to committer status.
  === Community ===
- HMS is used in Yahoo Lab to deploy test clusters.  The HMS community encourages suggestions
and contributions from any potential user and developer.
+ HMS is currently being worked on by developers from Hortonworks and there has been an expressed
interest from people at Yahoo!. There are users within Hortonworks & Yahoo! who use the
existing prototype for doing deployments of the Hadoop stack in lab environments. We hope
to extend the user and developer base further in the future and build a solid open source
community around HMS.
  === Core Developers ===
- The initial set of HMS committers includes folks from the Hadoop and Chukwa communities.
We have varying degrees of experience with Apache-style open source development.
+ HMS is currently being developed by four engineers from Hortonworks - Eric Yang, Owen O’Malley,
Vitthal (a.k.a Suhas) Gogate and Devaraj Das.  In addition, a Yahoo! employee, Jagane Sundar,
and an IBM employee, Kan Zhang, are also involved. Eric, Jagane and Kan are the original developers.
All the engineers have deep expertise in Hadoop and are quite familiar with the Hadoop Ecosystem.
  === Alignment ===
- HMS is a deployment system designed for Apache Hadoop. This is why Apache Hadoop is the
most important dependency for HMS. And HMS is also a particularly good fit for Apache due
to integration potential with other projects specifically Apache HBase and Apache Pig.
+ The ASF is a natural host for HMS given that it is already the home of Hadoop, Pig, HBase,
Cassandra, and other emerging cloud software projects. HMS has been designed to solve the
deployment, management and configuration problems of the Hadoop ecosystem family of products.
HMS fills a gap in the Hadoop ecosystem in the areas of configuration, deployment
and manageability.
  == Known Risks ==
  === Orphaned products & Reliance on Salaried Developers ===
- HMS is in use by companies we work for so the companies have an interest in its continued
+ The core developers plan to work full time on the project. There is very little risk of
HMS getting orphaned. HMS is in use by companies we work for so the companies have an interest
in its continued vitality.
  === Inexperience with Open Source ===
- Most of the committers have experience working on open source projects and there is also
at least one developer who has experience as a committer on other Apache projects.
+ All of the core developers are active users and followers of open source. Eric Yang is a
committer on Apache Chukwa. Owen O’Malley is the lead of the Apache Hadoop project.  Devaraj
Das is an Apache Hadoop committer and Apache Hadoop PMC member. Vitthal (Suhas) Gogate has
contributed extensively to the Hadoop Vaidya project (part of Apache Hadoop). Jagane Sundar
has been contributing, in terms of ideas, to the Hadoop project. Kan Zhang is a Hadoop Committer.
+ === Homogeneous Developers ===
+ The current core developers are from Hortonworks, IBM, and Yahoo!. However, we hope to
establish a developer community that includes contributors from several corporations.
+ === Reliance on Salaried Developers ===
+ Currently, the developers are paid to do work on HMS. However, once the project has a community
built around it, we expect to get committers and developers from outside the current core
  === Relationships with Other Apache Products ===
+ HMS is going to be used by the users of Hadoop and the Hadoop ecosystem in general.
- HMS uses Zookeeper, and is designed to improve Hadoop ecosystem deployment.
- As mentioned above, the current list of committers includes developers from at least two
different companies plus many independent volunteers.
  === An Excessive Fascination with the Apache Brand ===
- Apache offers us a clear licensing framework and support infrastructure which would reassure
the many users of HMS who exploit it in commercial environments as well as those in other
open source projects.
+ While we respect the reputation of the Apache brand and have no doubts that it will attract
contributors and users, our interest is primarily to give HMS a solid home as an open source
project following an established development model. We have also given reasons in the Rationale
and Alignment sections.
  == Documentation ==
+ There is documentation in Hortonworks’s internal repositories.
- The existing project page could be found here:
- HMS Architecture:
  == Initial Source ==
+ The source is currently in Hortonworks’s internal repositories.
  == Source and Intellectual Property Submission Plan ==
- The complete HMS code is under Apache Software License 2. The complete codebase is already
hosted in ASF Repository.
+ The complete HMS code is under Apache Software License 2.
  == External Dependencies ==
@@ -95, +109 @@

  == Mailing lists ==
-  * dev AT hms DOT apache DOT org
+  * hms-dev AT incubator DOT apache DOT org
-  * commits AT hms DOT apache DOT org
+  * hms-commits AT incubator DOT apache DOT org
-  * user AT hms DOT apache DOT org
+  * hms-user AT incubator DOT apache DOT org
-  * private AT hms DOT apache DOT org
+  * hms-private AT incubator DOT apache DOT org
  == Subversion Directory ==
  == Issue Tracking ==
@@ -112, +126 @@

   * Devaraj Das (ddas AT apache DOT org)
   * Vitthal Suhas Gogate (gogate AT apache DOT org)
+  * Owen O'Malley (omalley AT apache DOT org)
   * Jagane Sundar (jagane AT sundar DOT org)
   * Eric Yang (eyang AT apache DOT org)
   * Kan Zhang (kzhang AT apache DOT org)
@@ -127, +142 @@

  == Sponsors ==
+  * Hortonworks
  == Champion ==
+  * Owen O'Malley
  === Nominated Mentors ===
