hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Trivial Update of "Hbase/ShellPlans" by udanax
Date Wed, 22 Aug 2007 02:20:28 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by udanax:
http://wiki.apache.org/lucene-hadoop/Hbase/ShellPlans

The comment on the change is:
add some examples

------------------------------------------------------------------------------
  
  Hbase altools is an Hbase Shell sub-'interpreter' (or 'shell') program that provides scalable
data processing capabilities such as aggregation and algebraic calculation (groups and sets, commutative
rings, algebraic geometry, and linear algebra) on Hadoop + Hbase based parallel machines.
In particular, it will focus on storing and manipulating numeric, sparse matrices on Hbase.
  
-  ~-''-- Altools operations will show how Google search's LSI, Google Earth's algebraic topology,
Google News' recommendation system are related to Bigtable.''-~
+ Altools operations will show or explain how Google search's LSI, Google Earth's algebraic
topology, and Google News' recommendation system are related to Bigtable.
- 
- I suggest to develop HBase Shell in SQL-style, and develop algebraic tools as a sub shell
in Intuitionalized-style as described below. 
- 
- {{{
- HBase > altools;
- 
- Hbase altools, 0.0.1 version
- Type 'help;' for Hbase altools usage.
- 
- Hbase.altools > who are you;
- 
-  Hadoop + Hbase based algebraic manipulation tools
- 
- Hbase.altools > exit;
- Hbase > exit;
- }}}
  
  == Background ==
  I expect Hadoop + Hbase to handle sparsity and data explosion very well in the near future.
Moreover, I believe the design of the multi-dimensional map structure and the 3D space model
of the data are optimized for rapid ad-hoc information retrieval in any orientation, as well
as for fast, flexible calculation and transformation of raw data based on formulaic relationships.
This is an advantage for analytical processing, because it allows users to easily formulate
complex queries and to filter or slice data into meaningful subsets, among other things.
@@ -41, +25 @@

  ||Substitute ||<99%>'''Substitute''' assigns an expression to a variable [A~Z].[[BR]][[BR]]~-''A = Table('movieLog_table');''-~
||
  ||IF...ELSE ||<99%>'''IF...ELSE''' imposes conditions on the execution. [[BR]][[BR]]~-''IF
( boolean_expression )[[BR]]B = command_statements;[[BR]]ELSE[[BR]]B = command_statements;''-~||
  ||Store ||<99%>'''Store''' stores results to the specified table. [[BR]][[BR]]~-''A
= Table('movieLog_table'); [[BR]]B = A.Selection(length > 100); [[BR]]Store B TO table('tmp_table')[or
file('backup.dat')];''-~ ||
+ 
+ '''Type''' 'help;' for Hbase altools usage.
+ 
+ {{{
+ HBase > altools;
+ 
+ Hbase altools, 0.0.1 version
+ Type 'help;' for Hbase altools usage.
+ 
+ Hbase.altools > help;
+ }}}
  
  == Relational Operators ==
  ||<bgcolor="#E5E5E5">'''Operator''' ||<bgcolor="#E5E5E5">'''Explanation''' ||
@@ -117, +112 @@

  === Factorization and Decomposition Operators ===
  
  ||<bgcolor="#E5E5E5">'''Function''' ||<bgcolor="#E5E5E5">'''Explanation''' ||
- ||LU ||<99%>'''LU Decomposition'''[[BR]]A procedure for decomposing an N by N matrix
A into a product of a lower triangular matrix L and an upper triangular matrix U, LU = A.[[BR]]'''Functions'''
: ~-''getL(), getU(), isSingular(), getPivot()''-~ [[BR]][[BR]]~-''A = Matrix('m_table','cf_1');[[BR]]B
= LUDecomposition(A);[[BR]]C = getU(B);[[BR]]D = getL(A);''-~||
+ ||LU ||<99%>'''LU Decomposition'''[[BR]]A procedure for decomposing an N by N matrix
A into a product of a lower triangular matrix L and an upper triangular matrix U, LU = A.[[BR]]'''Functions'''
: ~-''getL(), getU(), isSingular(), getPivot()''-~ [[BR]][[BR]]~-''A = Matrix('m_table','cf_1');[[BR]]B
= LUDecomposition(A);[[BR]]C = B.getU();[[BR]]D = B.getL();''-~||
- ||QR ||<99%>'''QR Decomposition'''[[BR]]For an m-by-n matrix A with m >= n, the
QR decomposition is an m-by-n orthogonal matrix Q and an n-by-n upper triangular matrix R
so that A = Q*R.[[BR]]'''Functions''' : ~-''getH(), getQ(), getR()''-~[[BR]][[BR]]~-''A =
Matrix('m_table','cf_1');[[BR]]B = QRDecomposition(A);[[BR]]C = getH(B);''-~||
+ ||QR ||<99%>'''QR Decomposition'''[[BR]]For an m-by-n matrix A with m >= n, the
QR decomposition is an m-by-n orthogonal matrix Q and an n-by-n upper triangular matrix R
so that A = Q*R.[[BR]]'''Functions''' : ~-''getH(), getQ(), getR()''-~[[BR]][[BR]]~-''A =
Matrix('m_table','cf_1');[[BR]]B = QRDecomposition(A);[[BR]]C = B.getH();''-~||
- ||Cholesky ||<99%>'''Cholesky Decomposition'''[[BR]]It is a special case of LU decomposition
applicable only if matrix to be decomposed is symmetric positive definite.[[BR]]'''Functions'''
: ~-''getL(), getU(), isSPD()''-~ [[BR]][[BR]]~-''A = Matrix('m_table','cf_1');[[BR]]B = CholeskyDecomposition(A);[[BR]]C
= getL(A);''-~||
+ ||Cholesky ||<99%>'''Cholesky Decomposition'''[[BR]]A special case of LU decomposition,
applicable only if the matrix to be decomposed is symmetric positive definite.[[BR]]'''Functions'''
: ~-''getL(), getU(), isSPD()''-~ [[BR]][[BR]]~-''A = Matrix('m_table','cf_1');[[BR]]B = CholeskyDecomposition(A);[[BR]]C
= B.getL();''-~||
- ||SVD ||<99%>'''SV(Singular Value) Decomposition'''[[BR]]For an m-by-n matrix A with
m >= n, the singular value decomposition is an m-by-n orthogonal matrix U, an n-by-n diagonal
matrix S, and an n-by-n orthogonal matrix V so that A = U*S*V'.[[BR]]'''Functions''' : ~-''getS(),
getU(), getV(), getSingularValues()''-~ [[BR]][[BR]]~-''A = Matrix('m_table','cf_1');[[BR]]B
= SVDecomposition(A);[[BR]]C = getU(B);''-~||
+ ||SVD ||<99%>'''SV(Singular Value) Decomposition'''[[BR]]For an m-by-n matrix A with
m >= n, the singular value decomposition is an m-by-n orthogonal matrix U, an n-by-n diagonal
matrix S, and an n-by-n orthogonal matrix V so that A = U*S*V'.[[BR]]'''Functions''' : ~-''getS(),
getU(), getV()''-~ [[BR]][[BR]]~-''A = Matrix('m_table','cf_1');[[BR]]B = SVDecomposition(A);[[BR]]C
= B.getU();''-~||
+ 
+ {{{
+ // Set up the matrix M from the matrix mapped in Hbase.
+ Hbase.altools > M = Matrix('m_table','cf_1'); 
+ 
+ M ([1, 2],
+    [3, 4])
+ }}}
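
The LU row in the table above can be illustrated on the same 2-by-2 matrix M. The following is only a minimal sketch in plain Python, assuming Doolittle elimination with partial pivoting; the function name `lu_decompose` and the returned `pivot` list are hypothetical and are not part of altools. With row pivoting, the product L*U equals the rows of M reordered by `pivot`, rather than M itself.

```python
def lu_decompose(a):
    """Return (L, U, pivot) such that L*U equals the rows of `a` reordered by `pivot`.

    Hypothetical helper for illustration only; not the altools implementation.
    """
    n = len(a)
    u = [list(map(float, row)) for row in a]   # working copy that becomes U
    l = [[0.0] * n for _ in range(n)]          # multipliers below the diagonal
    pivot = list(range(n))                     # pivot[i] = original index of row i
    for k in range(n):
        # Partial pivoting: choose the row with the largest entry in column k.
        p = max(range(k, n), key=lambda r: abs(u[r][k]))
        if u[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            u[k], u[p] = u[p], u[k]
            l[k], l[p] = l[p], l[k]
            pivot[k], pivot[p] = pivot[p], pivot[k]
        # Eliminate the entries below the pivot and record the multipliers in L.
        for r in range(k + 1, n):
            factor = u[r][k] / u[k][k]
            l[r][k] = factor
            for c in range(k, n):
                u[r][c] -= factor * u[k][c]
    for i in range(n):
        l[i][i] = 1.0                          # unit diagonal of L
    return l, u, pivot


M = [[1.0, 2.0], [3.0, 4.0]]
L, U, pivot = lu_decompose(M)
print("L =", L)
print("U =", U)
print("pivot =", pivot)
```

For this M the larger entry 3 is chosen as the first pivot, so the two rows are swapped before elimination.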
  
  '''(ex. 1)''' To find the Singular Value decomposition in Altools, do the following:
  [[BR]]~-'''''M = UΣV*'''''-~
  
  {{{
- //Set up the matrix M from mapped matrix in hbase.
+ Hbase.altools > A = M.SVDecomposition();
+ Hbase.altools > U = A.getU();
+ Hbase.altools > S = A.getS();
+ Hbase.altools > V = A.getV();
  
- Hbase.altools > M = Matrix('m_table','cf_1'); 
+ U ([[-0.40455358, -0.9145143 ],
+     [-0.9145143 ,  0.40455358]])
+ 
+ S ([ 5.4649857 ,  0.36596619])
+ 
+ V ([[-0.57604844, -0.81741556],
+     [ 0.81741556, -0.57604844]])
+ }}}
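
As a sanity check on the values printed above, the factors can be multiplied back together. The sketch below uses plain Python, and the helper names `matmul`, `transpose` and `max_abs_diff` are hypothetical. It suggests that the printed V already plays the role of V* in M = UΣV*: multiplying U, diag(S) and the printed V without a further transpose reproduces M, which matches the convention of libraries that return the transposed factor. This is an observation about the numbers shown, not a statement about how altools would behave.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


M = [[1.0, 2.0], [3.0, 4.0]]
U = [[-0.40455358, -0.9145143], [-0.9145143, 0.40455358]]
S = [5.4649857, 0.36596619]
V_printed = [[-0.57604844, -0.81741556], [0.81741556, -0.57604844]]

sigma = [[S[0], 0.0], [0.0, S[1]]]          # diagonal matrix built from S
as_is = matmul(matmul(U, sigma), V_printed)
transposed = matmul(matmul(U, sigma), transpose(V_printed))

print("error using V as printed:", max_abs_diff(as_is, M))
print("error using V transposed:", max_abs_diff(transposed, M))
```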
+ 
+ '''(ex. 2)''' To find the QR decomposition in Altools, do the following:
+ [[BR]]~-'''''M = QR'''''-~
+ 
+ {{{
+ Hbase.altools > A = M.QRDecomposition();
- Hbase.altools > U = M.getU();
+ Hbase.altools > Q = A.getQ();
- Hbase.altools > V = M.getV();
+ Hbase.altools > R = A.getR();
+ 
+ Q ([[-0.31622777, -0.9486833 ],
+     [-0.9486833 ,  0.31622777]])
+ 
+ R ([[-3.16227766, -4.42718872],
+     [ 0.        , -0.63245553]])
  }}}
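
The printed Q and R can be checked the same way. This is a minimal sketch in plain Python with hypothetical helper names (`matmul`, `transpose`, `max_abs_diff`); it only verifies the two defining properties stated in the QR row of the table, namely that Q has orthonormal columns and that Q*R reproduces M.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


M = [[1.0, 2.0], [3.0, 4.0]]
Q = [[-0.31622777, -0.9486833], [-0.9486833, 0.31622777]]
R = [[-3.16227766, -4.42718872], [0.0, -0.63245553]]
identity = [[1.0, 0.0], [0.0, 1.0]]

orthogonality_error = max_abs_diff(matmul(transpose(Q), Q), identity)
reconstruction_error = max_abs_diff(matmul(Q, R), M)

print("Q'Q vs identity:", orthogonality_error)
print("Q*R vs M:", reconstruction_error)
```

Both errors should be on the order of the rounding in the printed digits. Note that the signs of Q and R are not unique: negating a column of Q together with the matching row of R gives another valid factorization.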
  
  = Papers =
+  * [http://www.uib.no/People/nmabh/art/hpj.pdf High performance numerical libraries in Java]
   * ''[http://labs.google.com/papers/bigtable.html Bigtable] : A Distributed Storage System
for Structured Data''
   * ''Interpreting the Data: Parallel Analysis with [http://labs.google.com/papers/sawzall.html
Sawzall]''
   * ''Y!'s Research Project : [http://research.yahoo.com/project/pig Pig] Document''
