hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Trivial Update of "Hbase/HbaseShell/ShellPlans" by udanax
Date Fri, 02 Nov 2007 12:15:00 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by udanax:
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseShell/ShellPlans

------------------------------------------------------------------------------
  
   * https://issues.apache.org/jira/browse/HADOOP-1608
  ==== Projection ====
-   * selects a subset of the columnfamilies of a relation
+  * selects a subset of the columnfamilies of a relation
-   * Result = π ,,column_list,, (Relation) 
+  * Result = π ,,column_list,, (Relation) 
  {{{
  Relation
  +---------------------------------------------------+
@@ -38, +38 @@

  +------------------------------+
  }}}
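
The projection described above can be sketched in a few lines of Python. This is only an illustration over an in-memory dictionary, not Hbase shell code: the `relation` data, the `project` helper, and the `family:qualifier` column naming are assumptions made for the example.

```python
# Hypothetical stand-in for a relation: row key -> {"family:qualifier": value}.
relation = {
    "row1": {"title:name": "Alien", "year:released": "1979", "vote:user": "9"},
    "row2": {"title:name": "Heat", "year:released": "1995", "vote:user": "8"},
}


def project(rows, families):
    """Keep only the cells whose column family is listed in `families`."""
    wanted = set(families)
    return {
        key: {
            col: val
            for col, val in cells.items()
            if col.split(":", 1)[0] in wanted
        }
        for key, cells in rows.items()
    }


# Result = pi_{title, year}(Relation): every row survives, with fewer columns.
projected = project(relation, ["title", "year"])
```

Every row key is kept; only the set of columnfamilies shrinks.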
  ==== Selection ====
-   * selects a subset of the rows in a relation that satisfy a selection condition
+  * selects a subset of the rows in a relation that satisfy a selection condition
-   * Result = σ ,,selection_condition,, (Relation) 
+  * Result = σ ,,selection_condition,, (Relation) 
  {{{
  Relation
  +---------------------------------------------------+
@@ -62, +62 @@

  }}}
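
Selection can be sketched the same way. The snippet below is a minimal illustration, assuming rows held in a plain dictionary; the `select` helper and the sample data are hypothetical and say nothing about how the shell would evaluate the condition (full scan or map/reduce).

```python
# Hypothetical stand-in for a relation: row key -> {"family:qualifier": value}.
relation = {
    "row1": {"title:name": "Alien", "year:released": "1979"},
    "row2": {"title:name": "Heat", "year:released": "1995"},
}


def select(rows, condition):
    """Keep only the rows for which `condition(row_key, cells)` is true."""
    return {key: cells for key, cells in rows.items() if condition(key, cells)}


# Result = sigma_{year:released = '1979'}(Relation): fewer rows, same columns.
selected = select(relation, lambda key, cells: cells.get("year:released") == "1979")
```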
  
  ==== Group ====
+  * http://issues.apache.org/jira/browse/HADOOP-1658
-   * Aggregation functions on collections of data values: average, minimum, maximum, sum, count.
+  * Aggregation functions on collections of data values: average, minimum, maximum, sum, count.
-   * Group rows by value of a columnfamily and apply aggregate function independently to each group of rows.
+  * Group rows by value of a columnfamily and apply aggregate function independently to each group of rows.
-   * <Grouping columnfamilies> ƒ ,,function_list,, (Relation) 
+  * <Grouping columnfamilies> ƒ ,,function_list,, (Relation) 
  {{{
  Hbase > Group Relation by (studioName, SUM('vote:user'));
  }}}
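
As a rough illustration of what the Group statement above is meant to compute, here is a small Python sketch. The sample rows and the `group_sum` helper are assumptions for the example; only the column names `studioName` and `vote:user` come from the statement itself.

```python
from collections import defaultdict

# Hypothetical rows; each dict maps a column name to its value.
rows = [
    {"studioName": "Fox", "vote:user": 3},
    {"studioName": "Fox", "vote:user": 4},
    {"studioName": "MGM", "vote:user": 5},
]


def group_sum(rows, group_col, value_col):
    """Group rows by `group_col` and apply SUM to `value_col` within each group."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_col]] += row[value_col]
    return dict(totals)


# Mirrors: Group Relation by (studioName, SUM('vote:user'))
totals = group_sum(rows, "studioName", "vote:user")
```

The other aggregates listed above (average, minimum, maximum, count) would follow the same pattern with a different per-group accumulator.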
  
  ==== θ Join ====
+  * http://issues.apache.org/jira/browse/HADOOP-2021
-   * The join of two relations R1(A,,1,, ,A,,2,, ,...,A,,n,,) and R2(B,,1,, ,B,,2,, ,...,B,,m,,) is a relation with degree k=n+m and attributes (A,,1,, ,A,,2,, ,...,A,,n,, , B,,1,, ,B,,2,, ,...,B,,m,,) that satisfy the join condition 
+  * The join of two relations R1(A,,1,, ,A,,2,, ,...,A,,n,,) and R2(B,,1,, ,B,,2,, ,...,B,,m,,) is a relation with degree k=n+m and attributes (A,,1,, ,A,,2,, ,...,A,,n,, , B,,1,, ,B,,2,, ,...,B,,m,,) that satisfy the join condition 
    * Result = R1 ▷◁ ,,θ join_condition,, R2
  {{{
  R1
@@ -129, +131 @@

  }}}
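
A θ join can be sketched as a nested loop over both relations. The code below is a minimal in-memory illustration, not a proposal for how Hbase would execute it; the `theta_join` helper and the sample relations are hypothetical, and the attribute names of R1 and R2 are assumed to be disjoint so that the result has degree n+m.

```python
def theta_join(r1, r2, condition):
    """Pair every row of r1 with every row of r2; keep pairs passing `condition`."""
    return [{**a, **b} for a in r1 for b in r2 if condition(a, b)]


# Hypothetical relations: R1 has degree 2, R2 has degree 1.
r1 = [{"a1": 1, "a2": "x"}, {"a1": 5, "a2": "y"}]
r2 = [{"b1": 3}, {"b1": 7}]

# theta is "a1 < b1": an arbitrary comparison, not just equality.
joined = theta_join(r1, r2, lambda a, b: a["a1"] < b["b1"])
```

The nested loop compares every pair of rows, so it only serves to show the semantics of the operator.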
  
  === Linear Algebra ===
+  * Proof of concept implementation for hbase-based Matrix Computing
+   * https://issues.apache.org/jira/browse/HADOOP-1655
  
  === Algebraic Geometry ===
  
+  * Not Yet!
- ----
- = Execution Strategy =
  
- I think it's very difficult to implement it, but we will necessarily discuss later.
- 
- == References ==
-  * http://download-uk.oracle.com/docs/cd/B10501_01/server.920/a96524/c20paral.htm
  
  ----
- 
  = Some Ideas Note =
  
  {{{
  select column_qualifier1, column_qualifier2 from 2d_table(table_name, columnfamily_name) where row='row key';
  }}}
  
+  * User Defined Function (UDF)
- 
- Start Transaction, Commit, and Rollback Syntax 
  
  {{{
- START TRANSACTION ON 'row-key' OF table_name | BEGIN ON 'row-key' OF table_name
- COMMIT ['timestamp']
- ROLLBACK 
- }}}
- 
- You can group together a sequence of data manipulation statements in a single-row transaction.
- 
- The START TRANSACTION and BEGIN statements begin a new single-row transaction under the specified 'row-key' of table_name. 
- 
- COMMIT commits the current transaction, making its changes permanent. If timestamp is specified on commit, all the modifications under the single-row transaction are stored with the specified timestamp. If not, they are stored with the current time as their timestamps.
- 
- ROLLBACK rolls back the current transaction, canceling its changes. 
- 
- By default, for every statement execution that updates a table, Hbase immediately stores the update on disk.
- 
- ''~- TRANSACTION on a row-level only -- and this is all you could guarantee in HBase -- may be a bit-over-the-top and require more effort than its worth.  How about implementing this one last, if it is needed at all? -- St.Ack-~''
- 
- 
- User Defined Function (UDF)
- 
- {{{
- create function isValidUrl(address) returning boolean
+ create function isValidUrl(address) returning boolean ??
- 
- insert into webtable (url) values("http://www.google.com");
- insert into webtable (url) values("http://www.naver.com");
- 
- select * from webtable where isValidUrl(url); ???
- 
- //problems : full scan? or map/reduce?, how to print out results?
- 
- 
- create function isValidUrl(address) returning boolean
  
  insert into webtable (anchor) values("http://www.google.com");
  insert into webtable (anchor) values("http://www.naver.com");
@@ -191, +157 @@

  select * from webtable where row = 'http://blog.udanax.org' and isValidUrl(anchor);
  }}}
  
- Find the theaters in the radius 2km from specified center.
+  * Find the theaters in the radius 2km from specified center.
  
  {{{
  2D geographic data table :
