Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Lucenehadoop Wiki" for change notification.
The following page has been changed by udanax:
http://wiki.apache.org/lucenehadoop/Hbase/ShellPlans

== Background ==
I expect Hadoop + Hbase to handle sparsity and data explosion very well in near future. Moreover, i believe the design of the multidimensional map structure and the 3d space model of the data are optimized for rapid adhoc information retrieval in any orientation, as well as for fast, flexible calculation and transformation of raw data based on formulaic relationships. It is advantageous with respect to Analysis Processing as it allows users to easily formulate complex queries, and filter or slice data into meaningful subsets, among other things.
+
+ == Parallelization ==
+
+ Altools provides automatic parallelization of the most timeconsuming relational/matrix/vector operations, and will ensure that the iterative solvers are scalable.
+
+ * Parallel Processing of Relational Data
+ * Parallel Algorithms of MultiDimensional Matrix Operation
+ * Parallel Gaussian Elimination Algorithm

@@ 77, +85 @@
Hbase.altools > store B to table('result_table');
}}}
+ == Matrix Operators ==
+ '''Note''' that matrix operations are the core of many linear systems.
 == Matrix Arithmetic Operators ==
+ === Arithmetic Operators ===
'''Operator''' '''Explanation''' 
Addition <99%>'''Adding''' entries with the same indices. [[BR]][[BR]]~''A = Matrix('m_table','cf_1');[[BR]]B = Matrix('m_table','cf_2');[[BR]]C = A + B; '''// c,,ij,, = a,,ij,, + b,,ij,, (i : row key, j : column key)''' ''~ 
Subtraction <99%>'''Subtracting''' entries with the same indices.[[BR]][[BR]]~''A = Matrix('m_table','cf_1');[[BR]]B = Matrix('m_table','cf_2');[[BR]]C = A  B; '''// c,,ij,, = a,,ij,,  b,,ij,, (i : row key, j : column key)''' ''~ 
@@ 115, +125 @@
LU <99%>'''LU Decomposition'''[[BR]]A procedure for decomposing an N by N matrix A into a product of a lower triangular matrix L and an upper triangular matrix U, LU = A.[[BR]]'''Functions''' : ~''getL(), getU(), isSingular(), getPivot()''~ [[BR]][[BR]]~''A = Matrix('m_table','cf_1');[[BR]]B = LUDecomposition(A);[[BR]]C = B.getU();[[BR]]D = B.getL();''~
QR <99%>'''QR Decomposition'''[[BR]]For an mbyn matrix A with m >= n, the QR decomposition is an mbyn orthogonal matrix Q and an nbyn upper triangular matrix R so that A = Q*R.[[BR]]'''Functions''' : ~''getH(), getQ(), getR()''~[[BR]][[BR]]~''A = Matrix('m_table','cf_1');[[BR]]B = QRDecomposition(A);[[BR]]C = B.getH();''~
Cholesky <99%>'''Cholesky Decomposition'''[[BR]]It is a special case of LU decomposition applicable only if matrix to be decomposed is symmetric positive definite.[[BR]]'''Functions''' : ~''getL(), getU(), isSPD()''~ [[BR]][[BR]]~''A = Matrix('m_table','cf_1');[[BR]]B = CholeskyDecomposition(A);[[BR]]C = B.getL();''~
 SVD <99%>'''SV(Singular Value) Decomposition'''[[BR]]For an mbyn matrix A with m >= n, the singular value decomposition is an mbyn orthogonal matrix U, an nbyn diagonal matrix S, and an nbyn orthogonal matrix V so that A = U*S*V'.[[BR]]'''Functions''' : ~''getS(), getU(), getV()''~ [[BR]][[BR]]~''A = Matrix('m_table','cf_1');[[BR]]B = SVDecomposition(A);[[BR]]C = B.getU();''~
+ SVD <99%>'''SV(Singular Value) Decomposition'''[[BR]]For an mbyn matrix A with m >= n, the singular value decomposition is an mbyn orthogonal matrix U, an nbyn diagonal matrix S, and an nbyn orthogonal matrix V so that A = U*S*V'.[[BR]]'''Functions''' : ~''getS(), getU(), getV(), norm2(), rank()''~ [[BR]][[BR]]~''A = Matrix('m_table','cf_1');[[BR]]B = SVDecomposition(A);[[BR]]C = B.getU();''~
{{{
//Set up the matrix M from mapped matrix in hbase.
@@ 170, +180 @@
* ~''d'' : Column number of matrix X~
* ~''m'' : Rank of matrix X ( ≤ min(t,d) )~

Row Key <6>Column Families 
term <2> corpus: <2>link: <2>.. 
t01 corpus:c01 0 link:cnn.com 1 .. .. 
@@ 184, +193 @@
Hbase.altools > X = Matrix('t_table','corpus');
}}}
+ This example used tf*idf for more efficient and economical calculations.
+
* ~''w,,ij,, = tf,,ij,, '''·''' log( N/df,,j,, )''~
+ * ~''w'' : weighted value~
+ * ~''tf'' : the number of times the word appears in a ith document~
+ * ~''N'' : the number of documents of an entire corpus~
+ * ~''df'' : the number of documents containing the jth term~
 c01 c02 .. 
t01 0.00 0.00 .. 
t02 2.08 0.00 .. 
.. .. .. .. 
+ Weighted Matrix W
+
{{{
//Diagonal eigenvalue S
 Hbase.altools > M = X.SVDecomposition();
+ Hbase.altools > M = W.SVDecomposition();
Hbase.altools > S = M.getS();
}}}
@@ 207, +224 @@
Hbase.altools > D = M.getV();
}}}
 Dimension reduction (k = 3)
+ SVD allows a simple strategy for optimal approximate fit using smaller matrices. If the singular
+ values in S0 are ordered by size, the first k largest values may be kept and the remaining smaller ones
+ set to zero. The product of the resulting matrices is a matrix X′ which is only approximately equal to X,
+ and is of rank k. Since zeros were introduced into S0, the representation can be simplified by deleting
+ the zero rows and columns of S0 to obtain a new diagonal matrix S, and deleting the corresponding
+ columns of T0 and D0 to obtain T and D respectively. The result is a reduced model :
+
+ * ~''X′ = TSDT ≈ X''~
+
+ which is the rankk model with the best possible least square fit to X.
+
+ This example used (k = 3).
= Papers =
* [http://www.uib.no/People/nmabh/art/hpj.pdf High performance numerical libraries in Java]