incubator-general mailing list archives

From "Edward J. Yoon" <edw...@udanax.org>
Subject [VOTE] Accept Hama into the Incubator
Date Tue, 13 May 2008 23:27:51 GMT
Dear Incubator PMC,

There has been some discussion around the Hama proposal,
and we would now like to officially propose Hama to the Incubator
for consideration, with Grant Ingersoll's +1.

Please vote on accepting the Hama project for incubation. The full
Hama proposal is available at the end of this message and as a wiki
page at http://wiki.apache.org/incubator/HamaProposal. We ask the
Incubator PMC to sponsor the Hama podling, with myself, Ian Holsman,
and Jeff Eastman as the mentors.

The vote is open for the next 72 hours and only votes from the
Incubator PMC are binding.

[ ] +1 Accept Hama as a new podling
[ ] -1 Do not accept the new podling (provide reason, please)

----
== Abstract ==

Hama will develop a parallel matrix computational package based on
[http://hadoop.apache.org Hadoop] Map/Reduce.

== Proposal ==

Hama will develop a parallel matrix computational package that
provides a library of matrix operations on top of the Map/Reduce
framework. It targets large-scale numerical analysis and data mining
tasks that need the intensive computation power of operations such as
matrix inversion, e.g. linear regression, PCA, and SVM. It will also
be useful for many scientific applications, e.g. physics computations,
linear algebra, computational fluid dynamics, statistics, graphics
rendering, and many more.

== Background ==

Currently, several shared-memory based parallel matrix solutions
provide scalable, high-performance matrix operations, but the matrix
resources themselves cannot scale in terms of complexity. In addition,
Hadoop HDFS files and Map/Reduce can only be used with 1D blocked
algorithms.
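
To illustrate the difference between the two blocking schemes, here is
a minimal Python sketch. It is not Hama code, and the function names
row_blocks and grid_blocks are hypothetical. A 1D scheme cuts the
matrix into strips of whole rows, while a 2D scheme cuts it into tiles
along both rows and columns.

```python
def row_blocks(n_rows, n_parts):
    """1D partition: split the rows into contiguous strips of whole rows."""
    size = -(-n_rows // n_parts)  # ceiling division
    return [(start, min(start + size, n_rows))
            for start in range(0, n_rows, size)]


def grid_blocks(n_rows, n_cols, block):
    """2D partition: split the matrix into tiles of at most block x block."""
    return [((r, min(r + block, n_rows)), (c, min(c + block, n_cols)))
            for r in range(0, n_rows, block)
            for c in range(0, n_cols, block)]


strips = row_blocks(10, 3)     # half-open row ranges
tiles = grid_blocks(4, 4, 2)   # (row range, column range) pairs
```

For a 4 x 4 matrix, grid_blocks(4, 4, 2) yields four 2 x 2 tiles,
whereas row_blocks(4, 2) yields two strips of two full rows each.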

== Rationale ==

The Hama approach proposes using the 3-dimensional space of Row,
Column (Qualifier), and Time, together with the multi-dimensional
Column families of [http://hadoop.apache.org/hbase Hbase], which can
store large sparse matrices of various types (e.g. triangular
matrices, 3D matrices, etc.) and support 2D blocked algorithms. Their
auto-partitioned sparse sub-structure will be efficiently managed and
served by Hbase. Row and column operations can be done in linear time,
and several algorithms, such as ''structured Gaussian elimination'' or
''iterative methods'', run in O(number of non-zero elements in the
matrix / number of mappers) time on Hadoop Map/Reduce.
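
As a rough illustration of the cost claim above, here is a minimal
Python sketch of a sparse matrix-vector product written in a
map/reduce style. It is only a sketch under stated assumptions, not
the Hama implementation: a nested dict stands in for an Hbase table
(row key -> column qualifier -> value), a sequential loop stands in
for parallel mappers, and all names (map_row, reduce_rows, multiply)
are hypothetical.

```python
from collections import defaultdict

# Assumed stand-in for an Hbase table: only non-zero cells are stored.
matrix = {
    0: {0: 2.0, 3: 1.0},
    1: {1: 3.0},
    2: {0: 4.0, 2: 5.0},
}
vector = [1.0, 2.0, 3.0, 4.0]


def map_row(row_key, columns, vec):
    """Map step: emit one (row, partial product) pair per stored non-zero."""
    for col, value in columns.items():
        yield row_key, value * vec[col]


def reduce_rows(pairs):
    """Reduce step: sum the partial products that share a row key."""
    totals = defaultdict(float)
    for row_key, partial in pairs:
        totals[row_key] += partial
    return dict(totals)


def multiply(mat, vec, n_mappers):
    """Deal rows round-robin to mappers, run the map step, then reduce."""
    rows = sorted(mat.items())
    shards = [rows[i::n_mappers] for i in range(n_mappers)]
    emitted = []
    for shard in shards:  # each shard would run on its own mapper
        for row_key, columns in shard:
            emitted.extend(map_row(row_key, columns, vec))
    return reduce_rows(emitted)


result = multiply(matrix, vector, n_mappers=2)
```

Each stored non-zero is touched exactly once in the map step, so when
the rows are spread evenly, each mapper handles roughly (number of
non-zero elements / number of mappers) entries.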

== Current Status ==

In its current state, Hama is buggy and needs filling out, but a
generalized matrix interface and basic linear algebra operations have
been implemented within a large prototype system. In the future, we
need new parallel algorithms based on Map/Reduce to make heavy
decompositions and factorizations perform well. Hama also needs tools
to compose an arbitrary matrix from only certain data filtered out of
the Hbase array structure.

== Meritocracy ==

The initial developers are very familiar with meritocratic open source
development, both at Apache and elsewhere. Apache was chosen
specifically because the initial developers want to encourage this
style of development for the project.

== Community ==

Hama seeks to develop developer and user communities during incubation.

== Core Developers ==

The initial set of committers includes folks from the
[http://hadoop.apache.org Hadoop] & [http://hadoop.apache.org/hbase
Hbase] communities. We have varying degrees of experience with
Apache-style open source development, ranging from none to ASF
Members.

== Alignment ==

The developers of Hama want to work with the Apache Software
Foundation specifically because Apache has proven to provide a strong
foundation and set of practices for developing standards-based
infrastructure and server components.

== Known Risks ==
=== Orphaned products ===

Most of the active developers would like to become Hama committers or
PMC members and have a long-term interest in developing, maintaining,
and '''using''' the code.

=== Inexperience with Open Source ===

We already have good experience with the Apache open source
development process.

=== Homogenous Developers ===

The current list of committers includes developers from several
different companies ([http://en.wikipedia.org/wiki/NHN NHN, corp],
TMAX software, Korea Research Institute of Bioscience and
Biotechnology, Students) plus many independent volunteers. The
committers are geographically distributed across Europe and Asia.
They are experienced with working in a distributed environment.

=== Reliance on Salaried Developers ===

It is expected that Hama development will occur on both salaried time
and on volunteer time, after hours. While there is reliance on
salaried developers (currently from [http://en.wikipedia.org/wiki/NHN
NHN, corp], but it's expected that other companies' salaried
developers will also be involved), the Hama community is very active
and things should balance out fairly quickly. In the meantime,
[http://en.wikipedia.org/wiki/NHN NHN, corp] may support the project
by dedicating 'work time' to Hama, so that there is a smooth
transition.

=== Relationships with Other Apache Products ===

Hama has a strong relationship with Apache [http://hadoop.apache.org
Hadoop], [http://hadoop.apache.org/hbase Hbase] and
[http://lucene.apache.org/mahout Mahout]. Being part of Apache could
help foster closer collaboration with these three projects.

=== An Excessive Fascination with the Apache Brand ===

We believe in the processes, systems, and framework Apache has put in
place. The brand is nice, but is not why we wish to come to Apache.

== Documentation ==

 * http://code.google.com/p/hama/w/list

== Initial Source ==

 * http://code.google.com/p/hama/source/checkout

== External Dependencies ==

 * Hadoop (HDFS, Map/Reduce) License: Apache License, 2.0
 * Hbase (Sparse Matrix Table) License: Apache License, 2.0

== Required Resources ==

 * Developer and user mailing lists
  * hama-commits@incubator.apache.org
  * hama-dev@incubator.apache.org
  * hama-user@incubator.apache.org
 * A subversion repository
  *  https://svn.apache.org/repos/asf/incubator/hama
 * A JIRA issue tracker

== Initial Committers ==

 * Edward J. Yoon, (edward AT udanax DOT org)
 * Chanwit Kaewkasi, (chanwit AT gmail DOT com)
 * Cha MinChang, (minslovey AT gmail DOT com)
 * Suh ChangHee, (bluesvm AT gmail DOT com)
 * Ha Yongho, (yongho.ha AT gmail DOT com)
 * Hong Taehui, (hongtebari AT gmail DOT com)
 * Yoon JooSun, (ologist0 AT gmail DOT com)
 * Takkiel Shim, (tkshim AT gmail DOT com)
 * Donguk Choi, (alloe130 AT gmail DOT com)

== Sponsors ==
=== Nominated Mentors ===

 * Ian Holsman, (ianh AT apache DOT org)
 * Jeff Eastman, (jeastman AT windwardsolutions DOT com)
 * Edward J. Yoon, (edward AT udanax DOT org)

=== Sponsoring Entity ===
The Apache Incubator.

-- 
B. Regards,
Edward J. Yoon,
http://blog.udanax.org

