hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Trivial Update of "Matrix" by udanax
Date Tue, 29 Jan 2008 06:41:30 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by udanax:
http://wiki.apache.org/hadoop/Matrix

------------------------------------------------------------------------------
  == Hbase Matrix Package for Map/Reduce-based Parallel Matrix Computations ==
  
- The matrix package will be useful for some of the Large-Scale Numeric Analysis and Data
Mining which need the computation system of the Inverse Matrix for Data Mining related area
(e.g. linear regression, PCA, SVM, ..., etc).
+ The matrix package will be useful for large-scale numerical analysis and data mining tasks
that require computing the inverse of a matrix (e.g. linear regression, PCA, SVM).
  
- Generally, The current shared-memory based parallel matrix solution provides a scalable
and high performance matrix operations, however, matrix resources can't be scalable. But,
Using Hbase's Row,Column(Qualifier) two dimensional space, we are able to store large sparse
matrix. Also, The Auto-partitioned sparsity sub-structure will be efficiently managed and
serviced by Hbase. Row or Column operations can be done in linear time and algorithms such
as structured Gaussian elimination or iterative methods run in O(~-the number of non-zero
elements in the matrix-~ / ~-number of mappers (processors/cores)-~) time on Map/Reduce.
+ Generally, current shared-memory based parallel matrix solutions provide scalable,
high-performance matrix operations, but the matrix storage itself does not scale. Using
Hbase's 2-dimensional Row and Column (Qualifier) space, we are able to store large sparse
matrices. Also, the auto-partitioned sparsity sub-structure will be efficiently managed and served
by Hbase. Row or column operations can be done in linear time, and algorithms such as structured
Gaussian elimination or iterative methods run in O(~-the number of non-zero elements in the
matrix-~ / ~-number of mappers (processors/cores)-~) time on Map/Reduce.
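
The storage idea in the paragraph above can be illustrated with a small sketch. It is not the matrix package's code and does not use the Hbase API: plain in-memory sorted maps stand in for the Row and Column (Qualifier) space, and a loop over row partitions stands in for mappers. The class name `SparseMatrixSketch` and all of its methods are hypothetical, chosen only for illustration. Because only non-zero entries are stored and each partition touches only its own rows, the work per partition is roughly the number of non-zero elements divided by the number of partitions, assuming the non-zeros are spread evenly across rows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: in-memory maps stand in for an Hbase table,
// and a loop over row partitions stands in for Map/Reduce mappers.
class SparseMatrixSketch {
    // row index -> (column index -> value); only non-zero entries are stored.
    private final TreeMap<Integer, TreeMap<Integer, Double>> rows = new TreeMap<>();

    // Store a non-zero entry; zero values are simply not stored.
    void set(int row, int col, double value) {
        if (value == 0.0) {
            return;
        }
        rows.computeIfAbsent(row, k -> new TreeMap<>()).put(col, value);
    }

    // Entries that were never stored read back as zero.
    double get(int row, int col) {
        TreeMap<Integer, Double> r = rows.get(row);
        return (r == null) ? 0.0 : r.getOrDefault(col, 0.0);
    }

    // Number of stored (non-zero) entries.
    int nonZeroCount() {
        int n = 0;
        for (TreeMap<Integer, Double> r : rows.values()) {
            n += r.size();
        }
        return n;
    }

    // Split the stored row keys into at most `parts` contiguous ranges.
    List<List<Integer>> partitionRows(int parts) {
        if (parts < 1) {
            throw new IllegalArgumentException("parts must be >= 1");
        }
        List<Integer> keys = new ArrayList<>(rows.keySet());
        List<List<Integer>> out = new ArrayList<>();
        int chunk = (keys.size() + parts - 1) / parts; // ceiling division
        for (int start = 0; start < keys.size(); start += chunk) {
            int end = Math.min(start + chunk, keys.size());
            out.add(new ArrayList<>(keys.subList(start, end)));
        }
        return out;
    }

    // "Map" step for one partition: y[row] = sum over stored columns of A[row][col] * x[col].
    Map<Integer, Double> multiplyPartition(List<Integer> rowKeys, double[] x) {
        Map<Integer, Double> partial = new TreeMap<>();
        for (int row : rowKeys) {
            double sum = 0.0;
            for (Map.Entry<Integer, Double> e : rows.get(row).entrySet()) {
                sum += e.getValue() * x[e.getKey()];
            }
            partial.put(row, sum);
        }
        return partial;
    }

    // Combine the per-partition results into the dense result vector y = A * x.
    double[] multiply(double[] x, int numRows, int parts) {
        double[] y = new double[numRows];
        for (List<Integer> part : partitionRows(parts)) {
            for (Map.Entry<Integer, Double> e : multiplyPartition(part, x).entrySet()) {
                y[e.getKey()] = e.getValue();
            }
        }
        return y;
    }
}
```

In this sketch the partitions are processed one after another; in a real Map/Reduce job each partition would be handled by a separate mapper, which is where the division by the number of mappers in the bound above comes from.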
  
  === Initial Contributors ===
  
@@ -36, +36 @@

   * Cholesky Decomposition
  
  === Getting Started ===
+ Download the matrix package:
+ {{{
+ bash# wget http://wiki.apache.org/hadoop-data/attachments/Matrix/attachments/matrix-test_v0.0.1.tar
+ bash# ant package
+ }}}
+ After setting up the Hadoop and Hbase configuration files, run the following:
  {{{
  bash# ./bin/hadoop jar ./lib/hadoop-0.16.0-dev-hbase.jar org.apache.hadoop.hbase.matrix.ExampleDriver
  }}}
  ----
  == Future Plans ==
-  * it needs own Input/Output formatter and splitter.
+  * It needs its own Input/Output formatter and splitter.
- 
+  * Implement the decompositions and factorizations as Map/Reduce classes.
  ----
  == References ==
  
