hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Trivial Update of "Hbase/HbaseShell" by udanax
Date Tue, 18 Sep 2007 13:25:15 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by udanax:
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseShell

The comment on the change is:
add a comment

------------------------------------------------------------------------------
  = Project Links =
   * [:Hbase/HbaseShell/HQL: Hbase Query Language] Data Manipulation Statements that help manipulate data in Hbase
   * [:Hbase/RDF: HbaseRDF] for storing and querying RDF data.
-  * [:ShellPlans: Shell Plans] page for discussion and description of future operators. 
+  * [:Hbase/HbaseShell/ShellPlans: Shell Plans] page for discussion and description of future operators. 
  ----
  = Initial Contributor =
   * [:udanax:Edward Yoon] (R&D center, NHN corp.)
@@ -172, +172 @@

  = Comments =
  
  Please add comments related to the project below.
+ ----
+ Commented by [:udanax: Edward Yoon] 2007/09/18
  
+ '''Case 1 : My initial style'''
+ 
+ {{{
+  Hbase > A = table('movieLog_table');
+  Hbase > B = A.Group('RunningTime', SUM('vote'));
+  Hbase > Store B to table('m_table');
+ }}}
+ 
+ '''Case 2 : SQL style'''
+ 
+ {{{
+  Hbase > CREATE TABLE m_table ('vote');
+  Hbase > INSERT INTO m_table ('vote')
+      --> SELECT  * FROM  
+      --> (SELECT 
+      --> RunningTime,
+      --> sum('vote:user1') vote:user1, 
+      --> sum('vote:user2') vote:user2, 
+      --> sum('vote:user3') vote:user3, 
+      --> sum('vote:user4') vote:user4 
+          ...
+          ..
+      --> FROM movieLog_table
+      --> GROUP BY RunningTime)
+      --> ORDER BY 1; 
+ }}}
+ 
+ Both cases return the same results.
+ 
+ '''m_table : '''
+ 
+ {{{
+  Row                     Columnfamilies
+ ------------    ----------------------------------  
+  RunningTime          vote                ...
+ ------------     -------------------   ----------
+   112             vote:user2   1100       ...
+                   vote:user4   1500
+   124             vote:user3   1600
+   125             vote:user3   2850
+   131             vote:user1   2450
+                   vote:user4   3050
+                   ...
+ }}}
+ 
+ The expected matrix A (Movie Running Time by User) would look like the sample below:
+ [[BR]]Note : Cell data is an aggregate value of voting.
+ 
+ {{{
+            vote:user1 vote:user2 vote:user3 vote:user4
+ ---------- ---------- ---------- ---------- ----------
+    112                      1100                  1500
+    124                                 1600 
+    125                                 2850 
+    131           2450                             3050
+ }}}
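
As a rough illustration only (not HQL and not part of the shell), the grouping and summing in Case 1 and Case 2, followed by the pivot into matrix A, could be sketched in Python as below. The raw `movie_log` records are hypothetical: their individual values are made up so that the per-group sums match the m_table sample, and the helper names `group_and_sum` and `to_matrix` are assumptions for this sketch.

```python
from collections import defaultdict

# Hypothetical raw log records: (RunningTime, column, vote value).
# The individual values are invented; only their per-group sums
# match the m_table sample shown above.
movie_log = [
    (112, "vote:user2", 600),
    (112, "vote:user2", 500),
    (112, "vote:user4", 1500),
    (124, "vote:user3", 1600),
    (125, "vote:user3", 2000),
    (125, "vote:user3", 850),
    (131, "vote:user1", 2450),
    (131, "vote:user4", 3050),
]


def group_and_sum(records):
    """Group by running time and sum votes per column (sparse result)."""
    table = defaultdict(lambda: defaultdict(int))
    for running_time, column, vote in records:
        table[running_time][column] += vote
    return {row: dict(cols) for row, cols in table.items()}


def to_matrix(table):
    """Pivot the sparse table into row keys, column names, and a dense grid.

    Missing cells are None rather than 0, so "no votes" stays
    distinguishable from a sum of zero.
    """
    rows = sorted(table)
    columns = sorted({c for cols in table.values() for c in cols})
    grid = [[table[r].get(c) for c in columns] for r in rows]
    return rows, columns, grid


m_table = group_and_sum(movie_log)
rows, columns, grid = to_matrix(m_table)
```

The sparse dictionary mirrors the row / column-family layout of m_table, while the dense grid corresponds to matrix A with empty cells left as `None`.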
+ 
+ Now, suppose you want to analyze this matrix to find an unknown relationship between User and Movie Running Time.
+ 
+ {{{
+ Hbase > ok, you can find an unknown relationship using algebraic operations.
+ }}}
+ 
+ So, here are my questions.
+ 
+  * What is a better syntax for Multi-dimensional Algebraic Operations commands?
+  * I don't think a Sub-Shell (Separated Altools) is a good idea. What do you think?
+  * Should I attempt to imitate SQL? I think it may be more difficult than we expected.
+ 
+ Any comments are welcome.
+ [[BR]]Thank you.
+ 
