incubator-general mailing list archives

From Eric Yang <ey...@hortonworks.com>
Subject [PROPOSAL] HMS Project for the Apache Incubator
Date Sat, 20 Aug 2011 00:31:59 GMT
Greetings All,

We would like to propose the HMS Project for inclusion in the ASF Incubator
as a new podling. HMS is a monitoring, administration and lifecycle
management project for Apache Hadoop clusters. The complete proposal can be
found at:

http://wiki.apache.org/incubator/HMSProposal

The initial contents of this proposal are also pasted below for convenience.

Thanks and Regards,
Eric

= HMS Proposal =

== Abstract ==

HMS is a monitoring, administration and lifecycle management project for
Apache Hadoop clusters.

== Proposal ==

HMS will simplify the process of deployment, configuration, management
and monitoring of the collection of Hadoop services and applications
that compose a Hadoop cluster. The collection of services (Hadoop
Stack) will include at least HDFS, !MapReduce, HBase, Hive, HCatalog,
Pig and Zookeeper. HMS will be easily configurable to add additional
services and applications to the stack. Our plan is to support the
Hadoop stack as a unit of deployment and configuration, in which only
certain pre-tested versions of software components are supported as
part of the stack. Administrators can always enable or disable
individual software components of the Hadoop stack according to their
deployment needs.
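
To make the "stack as a unit" idea concrete, here is a minimal Python
sketch. It is an illustration only, not HMS code: the SUPPORTED_STACKS
table, the stack name, the StackSelection class and every version string
are hypothetical placeholders. It shows one way a tool could accept only
pre-tested stacks while still letting an administrator disable individual
components.

```python
from dataclasses import dataclass, field

# Hypothetical table of pre-tested stacks: stack name -> {component: version}.
# The version strings are placeholders, not a real compatibility matrix.
SUPPORTED_STACKS = {
    "example-stack-1": {
        "hdfs": "x.1",
        "mapreduce": "x.1",
        "hbase": "y.2",
        "pig": "z.3",
        "zookeeper": "w.4",
    },
}


@dataclass
class StackSelection:
    """An administrator's choice: one supported stack minus disabled components."""
    stack: str
    disabled: set = field(default_factory=set)

    def components(self):
        """Return {component: version} for the components left enabled."""
        if self.stack not in SUPPORTED_STACKS:
            raise ValueError(f"unsupported stack: {self.stack}")
        versions = SUPPORTED_STACKS[self.stack]
        # Disabling a component the stack does not contain is treated as an error.
        unknown = self.disabled - set(versions)
        if unknown:
            raise ValueError(f"not part of {self.stack}: {sorted(unknown)}")
        return {name: ver for name, ver in versions.items()
                if name not in self.disabled}
```

In this sketch, versions can never be chosen per component; the only
per-component decision is whether it is enabled, which mirrors the
deployment model described above.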

The main use cases that HMS is trying to address are the following:
 * Hadoop stack deployment and upgrades
 * Hadoop services configuration & management
 * Administration of Hadoop services
  * Includes starting and stopping services
  * Hadoop system maintenance tasks, such as fsck, format, re-balance,
and compaction
 * User access & quota management on Hadoop clusters
 * Easily check and be alerted to failures in Hadoop servers
 * Automated discovery of new machines that become available
 * Expanding and contracting Hadoop clusters
 * Automatic resynchronization to ‘desired’ state (of Hadoop stack) to
handle faulty nodes
 * Handle node burn-ins (stress test nodes using Hadoop before
deploying them for production use)
 * Simple monitoring and management UI
 * Dynamic configuration - Hadoop configuration deduced from machine
attributes (e.g., RAM, CPU, Disk)
 * Operational HBase-based (inspired by OpenTSDB) monitoring for Hadoop clusters
 * Make it possible for administrators to deploy other Hadoop related
services and client applications
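
The "automatic resynchronization to desired state" use case can be
pictured as a comparison between what a node should run and what it
actually runs. The following Python sketch is an assumption about how
such a comparison might look, not a description of the HMS prototype;
the function name plan_resync, the action names and the version strings
are all hypothetical.

```python
def plan_resync(desired, actual):
    """Compute the actions that move one node from `actual` to `desired`.

    Both arguments map component name -> version string.
    Returns a sorted list of (action, component, version) tuples.
    """
    actions = []
    for name, version in desired.items():
        if name not in actual:
            # Component is wanted but missing on the node.
            actions.append(("install", name, version))
        elif actual[name] != version:
            # Component is present at a version other than the desired one.
            actions.append(("change_version", name, version))
    for name, version in actual.items():
        if name not in desired:
            # Component is present but not part of the desired state.
            actions.append(("remove", name, version))
    return sorted(actions)


desired = {"hdfs": "v2", "hbase": "v1"}
actual = {"hdfs": "v1", "pig": "v1"}
for action in plan_resync(desired, actual):
    print(action)
```

A node that already matches the desired state yields an empty plan, so
running the comparison repeatedly is harmless.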

HMS is targeted at administrators responsible for managing Hadoop
clusters. HMS leverages existing data center management and monitoring
infrastructure - Nagios, LDAP, Kerberos, etc. All HMS functionality
and data will be accessible via RESTful APIs and command line tools to
facilitate its integration with existing data center management
suites.

For bare-metal provisioning, cluster administrators will continue to
use their existing infrastructure. Provisioning a machine from scratch
is not in the scope of the current roadmap.

== Background ==

Hadoop’s ecosystem includes many projects (HDFS, !MapReduce, Pig,
HBase, etc.). Users and operators typically want to deploy a
combination of these projects as a stack, and it takes a significant
amount of time to get a properly configured Hadoop cluster up and
running. HMS has been designed to solve that problem by automating
the whole process of deploying a stack.

HMS is being developed by engineers employed by Yahoo!, Hortonworks
and IBM. We expect such a tool to attract a large number of users and
to increase the adoption of Apache Hadoop’s ecosystem. We are
therefore proposing to make HMS an Apache open source project.

== Rationale ==

Hadoop clusters are complicated and difficult to deploy and manage.
The HMS project aims to improve the usability of Apache Hadoop. Doing
so will democratize Apache Hadoop, growing its community and
increasing the places Hadoop can be used and the problems it can
solve. By developing HMS in Apache we hope to gather a diverse
community of contributors, helping to make sure that HMS is deployable
in as many different situations as possible. Members of the Hadoop
development community will be able to influence HMS’s roadmap and
contribute to it. We believe having HMS as part of the Apache Hadoop
ecosystem will be a great benefit to all of Hadoop's users.

== Current Status ==

Prototype available, developed by the list of initial committers.

=== Meritocracy ===

Our intent with this incubator proposal is to start building a diverse
developer community around HMS following the Apache meritocracy model.
From the start, we have wanted to make the project open source and to
encourage contributors from multiple organizations. We plan to
provide plenty of support to new developers and to quickly promote
those who make solid contributions to committer status.

=== Community ===

We are happy to report that multiple organizations are already
represented on the initial team. We hope to extend the user and
developer base further in the future and build a solid open source
community around HMS.

=== Core Developers ===

HMS is currently being developed by four engineers from Hortonworks -
Eric Yang, Owen O’Malley, Vitthal (a.k.a. Suhas) Gogate and Devaraj
Das. In addition, a Yahoo! employee, Jagane Sundar, and an IBM
employee, Kan Zhang, are also involved. Eric, Jagane and Kan are the
original developers. All the engineers have deep expertise in Hadoop
and are quite familiar with the Hadoop ecosystem.

=== Alignment ===

The ASF is a natural host for HMS given that it is already the home of
Hadoop, Pig, HBase, Cassandra, and other emerging cloud software
projects. HMS has been designed to solve the deployment, management
and configuration problems of the Hadoop ecosystem family of products,
filling a gap in the Hadoop ecosystem in the areas of configuration,
deployment and manageability.

== Known Risks ==

=== Orphaned Products ===

The core developers plan to work full time on the project. There is
very little risk of HMS getting orphaned. HMS is in use by the
companies we work for, so those companies have an interest in its
continued vitality.

=== Inexperience with Open Source ===

All of the core developers are active users and followers of open
source. Eric Yang is a committer on Apache Chukwa. Owen O’Malley is
the lead of the Apache Hadoop project.  Devaraj Das is an Apache
Hadoop committer and Apache Hadoop PMC member. Vitthal (Suhas) Gogate
has contributed extensively to the Hadoop Vaidya project (part of
Apache Hadoop). Jagane Sundar has been contributing, in terms of
ideas, to the Hadoop project. Kan Zhang is a Hadoop Committer.

=== Homogeneous Developers ===

The current core developers are from Hortonworks, IBM, and Yahoo!.
However, we hope to establish a developer community that includes
contributors from several corporations.

=== Reliance on Salaried Developers ===

Currently, the developers are paid to do work on HMS. However, once
the project has a community built around it, we expect to get
committers and developers from outside the current core developers.

=== Relationships with Other Apache Products ===

HMS is going to be used by the users of Hadoop and the Hadoop
ecosystem in general.

=== An Excessive Fascination with the Apache Brand ===

While we respect the reputation of the Apache brand and have no doubts
that it will attract contributors and users, our interest is primarily
to give HMS a solid home as an open source project following an
established development model. We have also given reasons in the
Rationale and Alignment sections.

== Documentation ==

There is documentation in Hortonworks’s internal repositories.

== Initial Source ==

The source is currently in Hortonworks’s internal repositories.

== Source and Intellectual Property Submission Plan ==

The complete HMS code is under the Apache License, Version 2.0.

== External Dependencies ==

The dependencies all have Apache-compatible licenses. These include
BSD- and MIT-licensed dependencies.

== Cryptography ==

None

== Required Resources ==

== Mailing lists ==

 * hms-dev AT incubator DOT apache DOT org
 * hms-commits AT incubator DOT apache DOT org
 * hms-user AT incubator DOT apache DOT org
 * hms-private AT incubator DOT apache DOT org

== Subversion Directory ==

https://svn.apache.org/repos/asf/incubator/hms

== Issue Tracking ==

JIRA HMS

== Initial Committers ==

 * Devaraj Das (ddas AT apache DOT org)
 * Vitthal Suhas Gogate (gogate AT apache DOT org)
 * Owen O'Malley (omalley AT apache DOT org)
 * Jagane Sundar (jagane AT sundar DOT org)
 * Eric Yang (eyang AT apache DOT org)
 * Kan Zhang (kzhang AT apache DOT org)

== Affiliations ==

 * Devaraj Das (Hortonworks)
 * Vitthal Suhas Gogate (Hortonworks)
 * Owen O'Malley (Hortonworks)
 * Jagane Sundar (Yahoo!)
 * Eric Yang (Hortonworks)
 * Kan Zhang (IBM)

= Sponsors =

== Champion ==

 * Owen O'Malley

=== Nominated Mentors ===

 * Owen O'Malley
 * Arun C Murthy

=== Sponsoring Entity ===

Incubator PMC

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org

