incubator-general mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: [VOTE] Accept Hama into the Incubator
Date Wed, 14 May 2008 16:46:48 GMT
+1


On May 13, 2008, at 7:27 PM, Edward J. Yoon wrote:

> Dear Incubator PMC,
>
> There has been some discussion around the Hama proposal,
> and we would now like to officially propose Hama to the Incubator
> for consideration, with Grant Ingersoll's +1.
>
> Please vote on accepting the Hama project for incubation. The full
> Hama proposal is available at the end of this message and as a wiki
> page at http://wiki.apache.org/incubator/HamaProposal. We ask the
> Incubator PMC to sponsor the Hama podling, with myself, Ian Holsman,
> and Jeff Eastman as the mentors.
>
> The vote is open for the next 72 hours and only votes from the
> Incubator PMC are binding.
>
> [ ] +1 Accept Hama as a new podling
> [ ] -1 Do not accept the new podling (provide reason, please)
>
> ----
> == Abstract ==
>
> Hama will develop a parallel matrix computational package based on
> [http://hadoop.apache.org Hadoop] Map/Reduce.
>
> == Proposal ==
>
> Hama will develop a parallel matrix computational package that
> provides a library of matrix operations on top of the Map/Reduce
> framework. It targets large-scale numerical analysis and data mining
> workloads that need intensive matrix computation such as matrix
> inversion, e.g. linear regression, PCA and SVM. It will also be
> useful for many scientific applications, e.g. physics computations,
> linear algebra, computational fluid dynamics, statistics, graphic
> rendering and many more.
>
> == Background ==
>
> Currently, several shared-memory based parallel matrix solutions can
> provide scalable, high-performance matrix operations, but their
> matrix resources do not scale in terms of complexity. In addition,
> Hadoop HDFS files and Map/Reduce can only be used with a 1D blocked
> algorithm.
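
For readers unfamiliar with the terms, the difference between 1D and 2D blocking can be illustrated with a small sketch. This is only an illustration of the partitioning idea, not code from Hama; the function names `blocks_1d` and `blocks_2d` are hypothetical, and a square n x n matrix is assumed.

```python
def blocks_1d(n, parts):
    """Split an n x n matrix into horizontal row strips (1D blocking).

    Each block is ((row_start, row_end), (col_start, col_end)) with
    half-open ranges; every strip spans all n columns.
    """
    step = -(-n // parts)  # ceiling division
    return [((r, min(r + step, n)), (0, n)) for r in range(0, n, step)]


def blocks_2d(n, grid):
    """Split an n x n matrix into a grid of tiles (2D blocking).

    Both the row range and the column range are cut, so each tile
    covers only part of a row and part of a column.
    """
    step = -(-n // grid)  # ceiling division
    return [((r, min(r + step, n)), (c, min(c + step, n)))
            for r in range(0, n, step)
            for c in range(0, n, step)]


strips = blocks_1d(4, 2)  # two strips of two full rows each
tiles = blocks_2d(4, 2)   # four 2 x 2 tiles
```

With 1D blocking a task always holds complete rows, while 2D blocking lets a task hold a sub-matrix, which is what the 2D blocked algorithm in the Rationale below relies on.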
>
> == Rationale ==
>
> The Hama approach proposes using the 3-dimensional Row, Column
> (Qualifier) and Time space and the multi-dimensional Columnfamilies
> of [http://hadoop.apache.org/hbase Hbase], which can store large,
> sparse matrices of various types (e.g. Triangular Matrix, 3D
> Matrix, etc.) and make use of the 2D blocked algorithm. Its
> auto-partitioned sparsity sub-structure will be efficiently managed
> and served by Hbase. Row and Column operations can be done in
> linear time, and several algorithms, such as ''structured Gaussian
> elimination'' or ''iterative methods'', run in O(the number of
> non-zero elements in the matrix / number of mappers) time on Hadoop
> Map/Reduce.
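
A minimal sketch of why the work scales with the number of non-zero elements divided by the number of mappers, assuming a sparse matrix-vector product as the example operation. The nested dictionary below only mimics a row-keyed, column-qualified table; it is not the Hbase API, and `map_row`, `reduce_row` and `spmv` are hypothetical names. The map/reduce job is simulated in a single process.

```python
from collections import defaultdict

# Hypothetical row-keyed sparse store: {row: {column: value}}.
# Zero entries are simply absent, as in a sparse table.
matrix = {
    0: {0: 2.0, 2: 1.0},
    1: {1: 3.0},
    2: {0: 4.0, 2: 5.0},
}
vector = [1.0, 2.0, 3.0]


def map_row(row, cells, vec):
    """Emit one (row, partial product) pair per stored non-zero cell."""
    for col, value in cells.items():
        yield row, value * vec[col]


def reduce_row(partials):
    """Sum the partial products emitted for one row."""
    return sum(partials)


def spmv(mat, vec, num_mappers=2):
    """Sparse matrix-vector product as a simulated map/reduce job."""
    rows = sorted(mat)
    # Assign rows to mappers round-robin; each split would run in parallel.
    splits = [rows[i::num_mappers] for i in range(num_mappers)]
    shuffled = defaultdict(list)
    for split in splits:
        for row in split:
            for key, partial in map_row(row, mat[row], vec):
                shuffled[key].append(partial)
    return {row: reduce_row(parts) for row, parts in shuffled.items()}


result = spmv(matrix, vector)
```

Each mapper touches only the cells that are actually stored, so the total map work is proportional to the number of non-zero elements, and it is spread across the mappers.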
>
> == Current Status ==
>
> In its current state, Hama is buggy and needs filling out, but a
> generalized matrix interface and basic linear algebra operations have
> been implemented within a large prototype system. In the future, we
> need new parallel algorithms based on Map/Reduce to improve the
> performance of heavy decompositions and factorizations. We also need
> tools to compose an arbitrary matrix from only certain data filtered
> out of the Hbase array structure.
>
> == Meritocracy ==
>
> The initial developers are very familiar with meritocratic open source
> development, both at Apache and elsewhere. Apache was chosen
> specifically because the initial developers want to encourage this
> style of development for the project.
>
> === Community ===
>
> Hama seeks to develop developer and user communities during
> incubation.
>
> == Core Developers ==
>
> The initial set of committers includes folks from the
> [http://hadoop.apache.org Hadoop] & [http://hadoop.apache.org/hbase
> Hbase] communities. We have varying degrees of experience with
> Apache-style open source development, ranging from none to ASF
> Members.
>
> == Alignment ==
>
> The developers of Hama want to work with the Apache Software
> Foundation specifically because Apache has proven to provide a strong
> foundation and set of practices for developing standards-based
> infrastructure and server components.
>
> == Known Risks ==
> === Orphaned products ===
>
> Most of the active developers would like to become Hama Committers or
> PMC Members and have a long-term interest in developing, maintaining
> and '''using''' the code.
>
> === Inexperience with Open Source ===
>
> We already have good experience with the Apache open source
> development process.
>
> === Homogenous Developers ===
>
> The current list of committers includes developers from several
> different companies ([http://en.wikipedia.org/wiki/NHN NHN, corp],
> TMAX software, Korea Research Institute of Bioscience and
> Biotechnology, Students) plus many independent volunteers. The
> committers are geographically distributed across Europe and Asia.
> They are experienced with working in a distributed environment.
>
> === Reliance on Salaried Developers ===
>
> It is expected that Hama development will occur on both salaried time
> and on volunteer time, after hours. While there is reliance on
> salaried developers (currently from [http://en.wikipedia.org/wiki/NHN
> NHN, corp], but it's expected that other companies' salaried developers
> will also be involved), the Hama Community is very active and things
> should balance out fairly quickly. In the meantime,
> [http://en.wikipedia.org/wiki/NHN NHN, corp] might support the project
> in the future by dedicating 'work time' to Hama, so that there is a
> smooth transition.
>
> === Relationships with Other Apache Products ===
>
> Hama has a strong relationship with Apache [http://hadoop.apache.org
> Hadoop], [http://hadoop.apache.org/hbase Hbase] and
> [http://lucene.apache.org/mahout Mahout]. Being part of Apache could
> help foster closer collaboration with these three projects.
>
> === An Excessive Fascination with the Apache Brand ===
>
> We believe in the processes, systems, and framework Apache has put in
> place. The brand is nice, but is not why we wish to come to Apache.
>
> == Documentation ==
>
> * http://code.google.com/p/hama/w/list
>
> == Initial Source ==
>
> * http://code.google.com/p/hama/source/checkout
>
> == External Dependencies ==
>
> * Hadoop (HDFS, Map/Reduce) License: Apache License, 2.0
> * Hbase (Sparse Matrix Table) License: Apache License, 2.0
>
> == Required Resources ==
>
> * Developer and user mailing lists
>  * hama-commits@incubator.apache.org
>  * hama-dev@incubator.apache.org
>  * hama-user@incubator.apache.org
> * A subversion repository
>  *  https://svn.apache.org/repos/asf/incubator/hama
> * A JIRA issue tracker
>
> == Initial Committers ==
>
> * Edward J. Yoon, (edward AT udanax DOT org)
> * Chanwit Kaewkasi, (chanwit AT gmail DOT com)
> * Cha MinChang, (minslovey AT gmail DOT com)
> * Suh ChangHee, (bluesvm AT gmail DOT com)
> * Ha Yongho, (yongho.ha AT gmail DOT com)
> * Hong Taehui, (hongtebari AT gmail DOT com)
> * Yoon JooSun, (ologist0 AT gmail DOT com)
> * Takkiel Shim, (tkshim AT gmail DOT com)
> * Donguk Choi, (alloe130 AT gmail DOT com)
>
> == Sponsors ==
> === Nominated Mentors ===
>
> * Ian Holsman, (ianh AT apache DOT org)
> * Jeff Eastman, (jeastman AT windwardsolutions DOT com)
> * Edward J. Yoon, (edward AT udanax DOT org)
>
> === Sponsoring Entity ===
> The Apache Incubator.
>
> -- 
> B. Regards,
> Edward J. Yoon,
> http://blog.udanax.org
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> For additional commands, e-mail: general-help@incubator.apache.org
>


