hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Trivial Update of "Hbase/ShellPlans" by udanax
Date Tue, 21 Aug 2007 02:38:33 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by udanax:
http://wiki.apache.org/lucene-hadoop/Hbase/ShellPlans

------------------------------------------------------------------------------
  [[TableOfContents(5)]]
  ----
+  ''-- After the POC (proof of concept) review, many things can change.[[BR]]-- This project is
currently in the planning stage.  [https://issues.apache.org/jira/browse/HADOOP-1608 HADOOP-1608]
to add "Relational Algebra Operators" is currently in progress.[[BR]]-- If you have constructive
ideas, please advise me. [[MailTo(webmaster AT SPAMFREE udanax DOT org)]]''
  
  = Hbase Shell Altools Plan =
  
-  ''-- After the POC (proof of concept) review, many things can change.[[BR]]-- If you have constructive
ideas, please advise me. [[MailTo(webmaster AT SPAMFREE udanax DOT org)]][[BR]]-- This project
is currently in the planning stage.  [https://issues.apache.org/jira/browse/HADOOP-1608 HADOOP-1608]
to add "Relational Algebra Operators" is currently in progress.''
- 
  Hbase altools is an Hbase Shell sub-interpreter (or sub-shell) program that provides scalable
data processing capabilities such as aggregation and algebraic calculation (groups and sets, commutative
rings, algebraic geometry, and linear algebra) on Hadoop + Hbase based parallel machines.
In particular, it will focus on storing and manipulating numeric, sparse matrices on Hbase.
  
-  ''-- Altools Matrix operations will show how Google search's LSI, Google Earth's algebraic
topology, and Google News' recommendation system are related to Bigtable.''
+  ~-''-- Altools operations will show how Google search's LSI, Google Earth's algebraic topology,
and Google News' recommendation system are related to Bigtable.''-~
  
- I suggest developing the HBase Shell in SQL style, and developing '''al'''gebraic '''tools''' as
a sub-shell in an intuitive style as described below. 
+ I suggest developing the HBase Shell in SQL style, and developing algebraic tools as a sub-shell
in an intuitive style as described below. 
  
  {{{
  HBase > altools;
@@ -52, +51 @@

  ||Sort ||<99%>'''Sort''' the tuples (rows) of R, ordered by the column families
in columnfamily-list.[[BR]][[BR]]~-''A = Table('movieLog_table');[[BR]]B = Sort A by ('length');
'''//τ,,length,,(A)''' ''-~ ||
  
  '''(ex. 1)''' Find the title and the year of the movies that were produced by the 'Fox'
company and whose running time is more than 100 minutes.
- [[BR]]~-''π ,,title.year,, (σ ,,length > 100,, (movieLog_table) ∩ σ ,,studioName
= 'Fox',, (movieLog_table))''-~
+ [[BR]]~-'''''π ,,title.year,, (σ ,,length > 100,, (movieLog_table) ∩ σ ,,studioName
= 'Fox',, (movieLog_table))'''''-~
  
  {{{
  Hbase.altools > A = Table('movieLog_table'); 
@@ -63, +62 @@

  }}}
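
The expression in (ex. 1) can also be read as three plain set operations. The following is a minimal Python sketch of that reading, not the altools implementation: the in-memory rows standing in for 'movieLog_table' are made up, and the helper names `select`, `intersect` and `project` are hypothetical.

```python
# Hypothetical in-memory stand-in for 'movieLog_table'; the rows are invented.
movie_log = [
    {"title": "A", "year": 1990, "length": 120, "studioName": "Fox"},
    {"title": "B", "year": 1995, "length": 95, "studioName": "Fox"},
    {"title": "C", "year": 2001, "length": 130, "studioName": "Disney"},
]


def select(rows, predicate):
    """Selection (sigma): keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def intersect(left, right):
    """Intersection: rows that appear in both inputs, in left-hand order."""
    return [row for row in left if row in right]


def project(rows, columns):
    """Projection (pi): keep only the named columns and drop duplicates."""
    seen = []
    for row in rows:
        slim = {column: row[column] for column in columns}
        if slim not in seen:
            seen.append(slim)
    return seen


long_movies = select(movie_log, lambda r: r["length"] > 100)
fox_movies = select(movie_log, lambda r: r["studioName"] == "Fox")
result = project(intersect(long_movies, fox_movies), ["title", "year"])
print(result)
```

With the sample rows above, only movie "A" is both longer than 100 minutes and produced by 'Fox'.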
  
  '''(ex. 2)''' Theta Join : ▷◁,,C,,
- [[BR]]~-''movieStars_table▷◁,,actor < year,,movieLog_table = σ,,actor < year,,(movieStars_table
X movieLog_table)''-~
+ [[BR]]~-'''''movieStars_table▷◁,,actor < year,,movieLog_table = σ,,actor < year,,(movieStars_table
X movieLog_table)'''''-~
  
  {{{
  Hbase.altools > A = Table('movieStars_table'); 
@@ -74, +73 @@

  }}}
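
The identity in (ex. 2), a theta join being a selection over a cross product, can be sketched in a few lines of Python. This is only an illustration under assumptions: the sample rows and the `debutYear` column are hypothetical, and the two tables are assumed to have distinct column names so that merging rows does not overwrite values.

```python
from itertools import product


def theta_join(left, right, condition):
    """Theta join: the cross product of both inputs filtered by a condition.

    Assumes the two inputs use distinct column names.
    """
    merged = ({**l_row, **r_row} for l_row, r_row in product(left, right))
    return [row for row in merged if condition(row)]


# Hypothetical sample rows; 'debutYear' is an invented column for illustration.
movie_stars = [
    {"starName": "X", "debutYear": 1992},
    {"starName": "Y", "debutYear": 2000},
]
movie_log = [
    {"title": "A", "year": 1990},
    {"title": "C", "year": 2001},
]

joined = theta_join(movie_stars, movie_log, lambda r: r["debutYear"] < r["year"])
for row in joined:
    print(row)
```

Materialising the full cross product is the simplest reading of the formula; a distributed implementation would presumably avoid it, but that is outside this sketch.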
  
  '''(ex. 3)''' Find the year of the earliest movie for each actor.
- [[BR]]~-''γ ,,starName.MIN(year) → minYear,, (movieStars_table)''-~
+ [[BR]]~-'''''γ ,,starName.MIN(year) → minYear,, (movieStars_table)'''''-~
  
  {{{
  Hbase.altools > A = Table('movieStars_table');
@@ -91, +90 @@

  ||Division ||<99%>'''Division''' is solving the matrix equation AX = B for X.[[BR]][[BR]]~-''A
= Matrix('m_table','cf_1');[[BR]]B = Matrix('m_table','cf_2');[[BR]]C = A /[or \] B; '''//
C = A / B''' ''-~||
  ||Transpose ||<99%>'''Transpose''' of a matrix: the matrix formed by turning
all the rows of a given matrix into columns and vice versa.[[BR]][[BR]]~-''A = Matrix('m_table','cf_1');[[BR]]B
= Transpose(A); '''// B = A'''' ''-~||
  
- '''(ex. 1)''' The product C of two matrices A and B
- [[BR]]~-''C,,ij,, = ΣA,,ik,,B,,kj,, (1 ≤ i ≤ m , 1 ≤ j ≤n)''-~
+ 
+ '''(ex. 1)''' Matrix Addition
+ [[BR]]~-'''''C = A + B = (a,,ij,, + b,,ij,,)'''''-~
  
  {{{
+ // Set up the matrices A and B from the matrices mapped in Hbase.
+ 
  Hbase.altools > A = Matrix('m_table','cf_1');
  Hbase.altools > B = Matrix('m_table','cf_2');
- Hbase.altools > C = A * B;  
+ Hbase.altools > C = A + B;
  }}}
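
Because the plan focuses on sparse matrices, element-wise addition only has to visit the stored cells. Below is a minimal Python sketch under the assumption that a matrix is held as a nested dictionary `{row: {column: value}}`; this layout and the name `sparse_add` are illustrative choices, not the altools design.

```python
def sparse_add(a, b):
    """Return C = A + B for sparse matrices stored as {row: {column: value}}."""
    total = {}
    for matrix in (a, b):
        for i, row in matrix.items():
            target = total.setdefault(i, {})
            for j, value in row.items():
                target[j] = target.get(j, 0) + value
    # Drop cells that cancelled to zero so the result stays sparse.
    return {
        i: {j: v for j, v in row.items() if v != 0}
        for i, row in total.items()
        if any(v != 0 for v in row.values())
    }


A = {0: {0: 1, 2: 2}, 1: {1: 3}}
B = {0: {0: 4, 2: -2}, 2: {1: 5}}
C = sparse_add(A, B)
print(C)
```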
  
+ 
+ '''(ex. 2)''' The product C of two matrices A and B
+ [[BR]]~-'''''C,,ij,, = Σ,,k,, A,,ik,,B,,kj,, (1 ≤ i ≤ m, 1 ≤ j ≤ n)'''''-~
+ 
+ {{{
+ // Set up the matrices A and B from the matrices mapped in Hbase.
+ 
+ Hbase.altools > A = Matrix('m_table','cf_1');
+ Hbase.altools > B = Matrix('m_table','cf_2');
+ Hbase.altools > C = A * B;
+ }}}
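
The sum over k in the product formula can be sketched for sparse inputs as follows. This is an illustration under assumptions, not the planned implementation: matrices are held as nested dictionaries `{row: {column: value}}`, and `sparse_multiply` is a hypothetical name.

```python
def sparse_multiply(a, b):
    """Return C = A * B, with C[i][j] = sum over k of A[i][k] * B[k][j].

    Matrices are stored as {row: {column: value}}; missing cells count as zero.
    """
    product = {}
    for i, a_row in a.items():
        for k, a_ik in a_row.items():
            # Only rows of B that match a stored column of A contribute.
            for j, b_kj in b.get(k, {}).items():
                row = product.setdefault(i, {})
                row[j] = row.get(j, 0) + a_ik * b_kj
    return product


# A is 2 x 3 and B is 3 x 2, so C is 2 x 2.
A = {0: {0: 1, 2: 2}, 1: {1: 3}}
B = {0: {0: 4}, 1: {0: 5, 1: 6}, 2: {1: 7}}
C = sparse_multiply(A, B)
print(C)
```

Iterating over the stored cells of A and looking up the matching rows of B skips every zero term of the sum, which is the main reason to keep the sparse layout.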
+ 
- == Factorizations and Decompositions ==
+ === Factorization and Decomposition Operators ===
  
  ||<bgcolor="#E5E5E5">'''Function''' ||<bgcolor="#E5E5E5">'''Explanation''' ||
  ||LU ||<99%>'''LU Decomposition'''[[BR]]A procedure for decomposing an N by N matrix
A into a product of a lower triangular matrix L and an upper triangular matrix U, LU = A.[[BR]]'''Functions'''
: ~-''getL(), getU(), isSingular(), getPivot()''-~ [[BR]][[BR]]~-''A = Matrix('m_table','cf_1');[[BR]]B
= LUDecomposition(A);[[BR]]C = getU(B);[[BR]]D = getL(B);''-~||
@@ -109, +123 @@

  ||SVD ||<99%>'''SV(Singular Value) Decomposition'''[[BR]]For an m-by-n matrix A with
m >= n, the singular value decomposition is an m-by-n orthogonal matrix U, an n-by-n diagonal
matrix S, and an n-by-n orthogonal matrix V so that A = U*S*V'.[[BR]]'''Functions''' : ~-''getS(),
getU(), getV(), getSingularValues()''-~ [[BR]][[BR]]~-''A = Matrix('m_table','cf_1');[[BR]]B
= SVDecomposition(A);[[BR]]C = getU(B);''-~||
  
  '''(ex. 1)''' To find the Singular Value decomposition in Altools, do the following:
- [[BR]]~-''M = UΣV*''-~
+ [[BR]]~-'''''M = UΣV*'''''-~
  
  {{{
- Hbase.altools > M = Matrix('m_table','cf_1'); //Set up the matrix M from mapped matrix
in hbase.
+ // Set up the matrix M from the matrix mapped in Hbase.
+ 
+ Hbase.altools > M = Matrix('m_table','cf_1'); 
  Hbase.altools > U = M.getU();
  Hbase.altools > V = M.getV();
  }}}
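
For the LU entry in the table above, the relationship between L, U and the pivot vector can be sketched on a small dense matrix. The code below is a plain-Python illustration using partial pivoting; the function name `lu_decompose` and the returned tuple are assumptions and do not describe the behaviour of LUDecomposition(), getL(), getU() or getPivot() in altools.

```python
def lu_decompose(a):
    """Factor a square matrix so that the row-permuted A equals L * U.

    Returns (l, u, pivot), where row i of the permuted matrix is a[pivot[i]].
    Raises ValueError when the matrix is singular.
    """
    n = len(a)
    u = [[float(x) for x in row] for row in a]
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    pivot = list(range(n))
    for k in range(n):
        # Partial pivoting: pick the row with the largest entry in column k.
        p = max(range(k, n), key=lambda r: abs(u[r][k]))
        if u[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            u[k], u[p] = u[p], u[k]
            pivot[k], pivot[p] = pivot[p], pivot[k]
            # Swap the multipliers already stored in L for these two rows.
            for j in range(k):
                l[k][j], l[p][j] = l[p][j], l[k][j]
        for i in range(k + 1, n):
            l[i][k] = u[i][k] / u[k][k]
            for j in range(k, n):
                u[i][j] -= l[i][k] * u[k][j]
    return l, u, pivot


l, u, pivot = lu_decompose([[4, 3], [6, 3]])
print(l, u, pivot)
```

Here the second row is moved to the top because 6 is the larger entry in the first column, so the pivot vector records the swap.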
  
  = Papers =
-  * ''Bigtable: A Distributed Storage System for Structured Data''
-  * ''Interpreting the Data: Parallel Analysis with Sawzall''
-  * ''Y!'s Research Project : Pig Document''
+  * ''[http://labs.google.com/papers/bigtable.html Bigtable] : A Distributed Storage System
for Structured Data''
+  * ''Interpreting the Data: Parallel Analysis with [http://labs.google.com/papers/sawzall.html
Sawzall]''
+  * ''Y!'s Research Project : [http://research.yahoo.com/project/pig Pig] Document''
+  * ''[http://portal.acm.org/citation.cfm?doid=1247480.1247602 Map-Reduce-Merge] : Simplified
Relational Data Processing on Large Clusters''
-  * ''Py-Tables - Hierarchical Datasets in Python''
+  * ''[http://www.pytables.org/ PyTables] : Hierarchical Datasets in Python''
-  * ''Numpy - Scientific Tools for Python''
-  * ''C-Store: A Column Oriented DBMS''
+  * ''[http://numpy.scipy.org/ Numpy] : Scientific Tools for Python''
+  * ''[http://db.lcs.mit.edu/projects/cstore/ C-Store] : A Column Oriented DBMS''
  
