The following page has been changed by udanax:
http://wiki.apache.org/hadoop/Hama

== Introduction ==
'''Hama''' is a parallel matrix computation package based on Hadoop Map/Reduce. ''(Hama
means 'hippo' in Korean.)'' It will be useful for massively large-scale ''Numerical
Analysis'' and ''Data Mining'' tasks that require computationally intensive matrix operations
such as inversion, e.g. linear regression, PCA, SVM, etc. It will also be useful for many
scientific applications, e.g. physics computations, linear algebra, computational fluid
dynamics, statistics, graphics rendering and many more.
Currently, several shared-memory-based parallel matrix solutions provide scalable,
high-performance matrix operations, but their matrix resources do not scale in terms
of complexity. The '''Hama''' approach proposes using the 2-dimensional Row and Column (Qualifier)
space and the multi-dimensional Column families of Hbase, which can store large sparse matrices
and various types of matrices (e.g. Triangular Matrix, 3D Matrix, etc.). In addition, the
auto-partitioned sparsity substructure will be efficiently managed and served by Hbase. Row and
Column operations can be done in linear time, and several algorithms, such as structured Gaussian
elimination and iterative methods, run in O(~the number of nonzero elements in the matrix~ / ~number
of mappers (processors/cores)~) time on Hadoop Map/Reduce.
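
The storage layout and cost estimate above can be illustrated with a small sketch. It is not Hama's API: a plain Python dictionary stands in for an Hbase table (row key mapped to column qualifier and value), the round-robin row partitioning is an assumed scheme, and all function names (`make_sparse`, `partition_rows`, `map_rows`, `matvec`) are hypothetical. The mappers are simulated sequentially. The point is only that each mapper touches just the nonzero cells of its own rows, so the work per mapper is roughly the number of nonzero elements divided by the number of mappers.

```python
from collections import defaultdict


def make_sparse(entries):
    """Build a row -> {column: value} table, keeping nonzero cells only.

    This dict is a hypothetical in-memory stand-in for an Hbase table.
    """
    table = defaultdict(dict)
    for row, col, value in entries:
        if value != 0:
            table[row][col] = value
    return dict(table)


def partition_rows(table, num_mappers):
    """Assign row keys to mappers round-robin (an assumed partitioning)."""
    parts = [[] for _ in range(num_mappers)]
    for i, row in enumerate(sorted(table)):
        parts[i % num_mappers].append(row)
    return parts


def map_rows(table, rows, vector):
    """Map step: emit (row, dot product) using only the stored nonzeros."""
    for row in rows:
        yield row, sum(value * vector[col] for col, value in table[row].items())


def matvec(table, vector, num_mappers=2):
    """Sparse matrix-vector product; mappers are simulated one after another."""
    result = {}
    for rows in partition_rows(table, num_mappers):
        result.update(map_rows(table, rows, vector))
    return result


if __name__ == "__main__":
    a = make_sparse([(0, 0, 2.0), (0, 2, 1.0), (1, 1, 3.0), (2, 0, 4.0)])
    print(matvec(a, [1.0, 2.0, 3.0]))
```

Rows that contain no nonzero cells are never stored, so they do not appear in the result; a caller would treat a missing row as zero.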
=== Initial Contributors ===
* [:udanax:Edward Yoon] (R&D center, NHN corp.)
* Chanwit Kaewkasi (Ph.D. candidate, University of Manchester)
