hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Trivial Update of "Hbase/ShellPlans" by udanax
Date Wed, 08 Aug 2007 04:55:05 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by udanax:
http://wiki.apache.org/lucene-hadoop/Hbase/ShellPlans

------------------------------------------------------------------------------
   * '''Syntax definition.'''
    * [:udanax:Edward Yoon], Master.[[BR]]Open Collaboration, NHN corp.
    * Inchul Song, Ph.D. Candidate[[BR]]Database Lab[[BR]]Division of Computer Science, KAIST
-  * '''Code Implementation.'''
-   * [:udanax:Edward Yoon], Master.[[BR]]Open Collaboration, NHN corp.
-   * Inchul Song, Ph.D. Candidate[[BR]]Database Lab[[BR]]Division of Computer Science, KAIST
-   * Minsu Kim, System Engineer at Daum corp.
-   * Sewon Kim, System Engineer at Empas corp.
  
  If you have constructive ideas, please advise me. webmaster@udanax.org
  
  == Suggested Hbase Shell plans ==
+ === Hbase Query Language ===
- 
-  ''-- Inchul, Feel free to add your opinion.[[BR]]udanax''
- 
-  * [:Hbase/HbaseShell/HQL] - I've made some changes to your initial HQL to make it look more like SQL. I borrowed the syntax definition style from MySQL.
+ I've made some changes to your initial HQL to make it look more like SQL. I borrowed the syntax definition style from MySQL. 
+  -- [:Hbase/HbaseShell/HQL] by Inchul Song
  
  ----
  
@@ -39, +33 @@

   Hadoop + Hbase based algebraic manipulation tools
  
  Hbase.altools > exit;
- Hbase > eixt;
+ Hbase > exit;
  }}}
  Hbase altools is an Hbase Shell sub 'interpreter' (or 'shell') program to provide scalable data processing capabilities like aggregation and algebraic calculation (groups and sets, commutative rings, algebraic geometry, and linear algebra) on Hadoop + Hbase based parallel machines.

  
-  ''-- Altools Matrix operations will show how Google search's LSI, Google Earth's algebraic topology, Google News' recommendation system are related to Bigtable. See the HBase Shell [:Hbase/HbaseShell/Examples] Page''
+  ''-- Altools Matrix operations will show how Google search's LSI, Google Earth's algebraic topology, Google News' recommendation system are related to Bigtable. See the HBase Shell Usage Page. --[:Hbase/HbaseShell/Examples]''
  
  === Hbase altools Goals ===
   * A Simplified Import/Export/Migrate Functionality Between different data sources (Hadoop, HBase)
@@ -61, +55 @@

  ||Openness to live data access by other applications ||Excellent ||Limited ||
  ||Priorities ||High performance, High availability ||High flexibility, High user autonomy ||
  
- 
  Thus, I decided to develop a shell to process linear algebraic computing and large scale data using Hadoop's parallel processing and HBase storage.
  
  ''Then you may ask "What is the difference from MapReduce using MapFiles?"''
  
- I don't expect it to give us a high-performance just yet, but it will sure make data management and development much easier. 
+ I don't expect it to give us a high-performance just yet, but it will sure make data management and development much easier. First, let's take a look at HBase's data model. HBase provides a unified data model and it represents a data in 3-dimensional - Row, Column, and TImestamp. Also, Row and Column may be extended infinitely.
  
+ If we decide to cut the data model in time version, then we may view the new data as a 2D table. If index is in string, we may view it as a huge map. If index is in integer, then it is one huge 2D array. So each table may have such data storages in 3D (Columnfamilies) Locality Group(Columnfamilies) is a relationship that can occur between multiple references whenever one reference brings in much of the data used by the other references.
- First, let's take a look at HBase's data model. HBase provides a unified data model and it represents a data in 3-dimensional - Row, Column, and TImestamp. Also, Row and Column may be extended infinitely. If we decide to cut the data model in time version, then we may view the new data as a 2D table. If index is in string, we may view it as a huge map. If index is in integer, then it is one huge 2D array. 
- 
- So each table may have such data storages in 3D (Columnfamilies) Locality Group(Columnfamilies) is a relationship that can occur between multiple references whenever one reference brings in much of the data used by the other references.
  
  ----
  
- === Suggested Hbase altools Operators ===
+ === Suggested Hbase altools Syntax ===
- 
- work in progress.
- 
  '''Note''' that Data should be located by their row, column, and timestamp.
  
  ==== Commands ====
  ||<bgcolor="#E5E5E5">'''Command''' ||<bgcolor="#E5E5E5">'''Explanation''' ||
- ||Table ||'''Table''' command load from specified table. [[BR]][[BR]]~-''A = Table('movieLog_table');''-~ ||
+ ||Table ||<99%>'''Table''' command loads specified table. [[BR]][[BR]]~-''A = Table('movieLog_table');''-~ ||
- ||Matrix ||'''Matrix''' command control the configuration of the logic matrix. [[BR]][[BR]]~-''M = Matrix(table_name, columnfamily_name[, scalar S]);''-~ ||
+ ||Matrix ||<99%>'''Matrix''' command constructs the configuration of the logic matrix. [[BR]][[BR]]~-''M = Matrix(table_name, columnfamily_name[, scalar S]);''-~ ||
- ||Substitute || '''Substitute''' expression to [A~Z][[BR]][[BR]]~-''A = Table('movieLog_table');''-~ ||
+ ||Substitute ||<99%>'''Substitute''' expression to [A~Z][[BR]][[BR]]~-''A = Table('movieLog_table');''-~ ||
- ||Store ||'''Store''' command will store results to specified table. [[BR]][[BR]]~-''A = Table('movieLog_table'); [[BR]]B = A.Selection(length > 100); [[BR]]Store B TO table('tmp_table')[or file('backup.dat')];''-~ ||
+ ||Store ||<99%>'''Store''' command will store results to specified table. [[BR]][[BR]]~-''A = Table('movieLog_table'); [[BR]]B = A.Selection(length > 100); [[BR]]Store B TO table('tmp_table')[or file('backup.dat')];''-~ ||
  ==== Relational Operators ====
  ||<bgcolor="#E5E5E5">'''Operator''' ||<bgcolor="#E5E5E5">'''Explanation''' ||
  ||Projection ||<99%>'''Projection''' of a relation ~+R+~, it makes a new relation as the set that is obtained when all tuples(rows) in ~+R+~ are restricted to the set {columnfamily,,1,,,...,columnfamily,,n,,}.[[BR]][[BR]]~-''A = Table('movieLog_table');[[BR]]B = A.Projection('year','length');''-~ ||
  ||Selection ||<99%>'''Selection''' of a relation ~+R+~, it makes a new relation as the set of specified tuples(rows) of the relation ~+R+~[[BR]]'''Set Operations''' : ~-''OR, AND, NOT''-~[[BR]][[BR]]~-''A = Table('movieLog_table');[[BR]]B = A.Selection(length > 100 AND studioName = 'Fox');''-~ ||
  ||Group ||<99%>'''Group''' tuples by value of an attribute and apply aggregate function independently to each group of tuples.[[BR]]'''Aggregate Functions''' : ~-''AVG( attribute ), SUM( attribute ), COUNT( attribute ), MIN( attribute ), MAX( attribute )''-~[[BR]][[BR]]~-''A = Table('movieLog_table');[[BR]]B = A.Group('studioName', MIN('year'));''-~ ||
- ||Sort ||<99%>'''Sort''' of tuples(rows) of R, ordered according to columnfamilies on columnfamily-list[[BR]][[BR]]~-''A = Table('movieLog_table');[[BR]]B = Sort by ('length');''-~ ||
+ ||Sort ||<99%>'''Sort''' of tuples(rows) of R, ordered according to columnfamilies on columnfamily-list[[BR]][[BR]]~-''A = Table('movieLog_table');[[BR]]B = Sort A by ('length');''-~ ||
  
  ==== Matrix Arithmetic Operators ====
  ||<bgcolor="#E5E5E5">'''Operator''' ||<bgcolor="#E5E5E5">'''Explanation''' ||
@@ -110, +98 @@

  ----
  = Implementation =
  
- '''Note''' : We should first test on local machines.
+ '''Note''' : ''We should first test on local machines. -- udanax''
- [[BR]]Java code formatting style. [http://www.hadoop.co.kr/wiki/moin.cgi/HBaseShell?action=AttachFile&do=get&target=uncle-jim-code-style.xml]
+ [[BR]] ''Code Style Formatter'' [attachment:uncle-jim-code-style.xml]
+ [[BR]]http://wiki.apache.org/lucene-hadoop/CodeReviewChecklist
  
  {{{
  Run the following: 
@@ -132, +121 @@

  ----
  = Example Of Hbase Shell Use =
  
- See the HBase Shell [:Hbase/HbaseShell/Examples] Page
+ See the HBase Shell Usage Page [:Hbase/HbaseShell/Examples]
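
Reader's note on the data-model paragraph in the diff above: the idea of addressing data by row, column, and timestamp, and then "cutting" it at one time version to get a 2D table, can be illustrated with a small sketch. This is plain Python, not HBase or altools code. The names `cells`, `put`, and `slice_at`, the column names, and the sample values are all hypothetical, and "latest value at or before the given timestamp" is an assumed reading of what the cut means.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for one table:
# (row, column) -> {timestamp: value}. Not an HBase API.
cells = defaultdict(dict)

def put(row, column, timestamp, value):
    """Store one versioned cell, addressed by row, column, and timestamp."""
    cells[(row, column)][timestamp] = value

def slice_at(timestamp):
    """Cut the 3D data at a time version.

    Returns a 2D view {row: {column: value}} holding, for each cell,
    the newest value written at or before `timestamp` (assumed semantics).
    """
    view = {}
    for (row, column), versions in cells.items():
        visible = [ts for ts in versions if ts <= timestamp]
        if visible:
            view.setdefault(row, {})[column] = versions[max(visible)]
    return view

# Invented sample data, for illustration only.
put("movie1", "info:length", 1, 95)
put("movie1", "info:length", 2, 120)
put("movie1", "info:studioName", 1, "Fox")
put("movie2", "info:length", 2, 88)

table_v1 = slice_at(1)  # 2D view as of timestamp 1
table_v2 = slice_at(2)  # 2D view as of timestamp 2
```

With string row keys, as here, each slice behaves like a nested map. With integer row and column indexes the same slice could be read as a 2D array.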
  

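
Reader's note on the Relational Operators table in the diff above: the proposed Projection, Selection, Group, and Sort operators can be mimicked on a list of dictionaries to show the intended semantics. This is only a sketch in plain Python. The altools syntax in the wiki page is a proposal, and the function names, the sample rows, and the `title` column below are hypothetical.

```python
from collections import defaultdict

# Invented sample rows standing in for Table('movieLog_table').
A = [
    {"title": "a", "year": 1999, "length": 120, "studioName": "Fox"},
    {"title": "b", "year": 2003, "length": 95, "studioName": "Fox"},
    {"title": "c", "year": 2001, "length": 130, "studioName": "Disney"},
]

def projection(rows, *columns):
    """Restrict every row to the named columns."""
    return [{c: r[c] for c in columns} for r in rows]

def selection(rows, predicate):
    """Keep only the rows for which the predicate holds."""
    return [r for r in rows if predicate(r)]

def group(rows, key, aggregate, attribute):
    """Group rows by `key` and apply `aggregate` to `attribute` per group."""
    buckets = defaultdict(list)
    for r in rows:
        buckets[r[key]].append(r[attribute])
    return {k: aggregate(v) for k, v in buckets.items()}

def sort_by(rows, column):
    """Return the rows ordered by one column, ascending."""
    return sorted(rows, key=lambda r: r[column])

# Rough counterparts of the examples in the table.
B_proj = projection(A, "year", "length")    # A.Projection('year','length')
B_sel = selection(A, lambda r: r["length"] > 100 and r["studioName"] == "Fox")
B_grp = group(A, "studioName", min, "year")  # A.Group('studioName', MIN('year'))
B_sort = sort_by(A, "length")                # Sort A by ('length')
```

The sketch runs on one machine in memory. The wiki page's stated aim is to run such operators over Hadoop and HBase, which this example does not attempt to show.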