Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Lucenehadoop Wiki" for change notification.
The following page has been changed by udanax:
http://wiki.apache.org/lucenehadoop/Hbase/Matrix
The comment on the change is:
I'll rewrite after some transformation.

 [[TableOfContents(4)]]
+ deleted
 * It's a designing stage.

 
 == LINA, a Framework for Largescale Sparse Linear Algebra ==

 Using Hbase's Row,Column(Qualifier) two dimensional space, we are able to store large sparse
matrix.
 [[BR]]The Autopartitioned sparsity substructure will be efficiently managed and serviced
by Hbase.

 Row or Column operations can be done in linear time and algorithms such as structured Gaussian
elimination
 [[BR]]or iterative methods run in '''O(~the number of nonzero elements in the matrix~)'''
time, Hbase Providing Both Vertical and Horizontal access.

 Therefore, using iterative algorithms on a parallel processing platform like !MapReduce
on Hadoop,
 [[BR]]we should be able to implement '''High Performance''' and '''World's Largest''' Matrix
Computations.

 === Applications by LINA ===

 It will support a wide variety of applications in the domain of Physics, Linear Algebra,
Relational Algebra,
 [[BR]]Statistics, Graphics Rendering, Computational Dynamics and others.

 * Scientific simulation and modeling
 * Matrixvector/matrixmatrix multiply
 * Soving linear systems
 * Collaborative filtering/recommendation systems
 * Information retrieval
 * Sorting
 * Finding eigenvalues and eigenvectors
 * Computer graphics and computational geometry
 * Matrix multiply
 * computing matrix determinate

 === Initial Contributors ===

 * [:udanax:Edward Yoon] (R&D center, NHN corp.)
 * I have profited by '''Prof. Samuel Kim''', '''Dr. Yongchan Park'''

 == Storing and manipulating numeric, sparse matrices on Hadoop + Hbase ==

 To store sparse matrix in hbase, I will use an abstracted table with a single column family,
and tune existing partitioning conditions.
 The extendable columnfamily feature combine matrices that have the same row dimension.

 === Scheduling Algorithm ===

 The scheduling algorithm is designed for driving the parallel execution of the factorization
on on a MapReduce model.

 NOTE : indirect and irregular, the inefficient data access ....

 NOTE : blocking factor........ code generator......

 * Needs
 * Total Row/Column Key Count in the matrix table header.
 * Big floating point number, Big integer calculator

 {{{

 ++ ++  ++
  . .  .   . 
  . .   .   .
   ++  ++
   > +
   ++ ++
  . . . .  . .  . .
   ++ ++
 ++ 

 }}}


 === Sums on the MapReduce Tasks of Hadoop ===

 == Examples ==

 === Collaborative Filtering Via Ensembles of Matrix Factorizations ===

 No work, no grub.

 === Fast PageRank Computation Via Sparse Linear System ===

 No work, no grub.

 === Latent Semantic Analysis Via Singular Value Decomposition ===

 No work, no grub.

 == References ==

 * High performance numerical libraries in Java, BjørnOve Heimsund
 * Parallel Conjugate Gradients Assignment, a parallel implementation of the conjugate gradient
algorithm
 * ScaLAPACK, a library of highperformance linear algebra routines for distributedmemory
messagepassing MIMD computers
 * [http://bebop.cs.berkeley.edu/oski/ OSKI], optimized sparse kernel interface (OSKI) library
 * [http://icl.cs.utk.edu/iclprojects/pages/files/sans/yelickbebop.pdf Automatic Performance
Tuning of Sparse Matrix Kernels], August 7, 2002
 * Scheduling algorithms for parallel Gaussian elimination with communication costs, Amoura,
A.K.; Bampis, E.; Konig, J.C.
 * Google's PageRank and Beyond: The Science of Search Engine Rankigs, Amy N. Langville;
Carl D. Meyer

