hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hbase/Groovy" by Misty
Date Mon, 19 Oct 2015 02:11:58 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hbase/Groovy" page has been changed by Misty:
https://wiki.apache.org/hadoop/Hbase/Groovy?action=diff&rev1=5&rev2=6

  == Using HBase From Groovy ==
  
- The normal HBase Java API may be used directly in Groovy.  This page describes a 'builder'
class to make some HBase methods more convenient.  Namely, table structure manipulation is
done in a hierarchical fashion similar to DDL, and scanners and batch updates are performed
in a closure scope to ease iteration and resource clean-up.
+ This page is obsolete and the Github project it referred to no longer exists.
  
- Note that this is not part of the 'official' HBase API, but a community contribution.  It
is a single class that should be easy enough to include in your own project source.  The code
is released under the ASF 2.0 license.  To get support or submit enhancements to this code,
please email the [[http://hadoop.apache.org/hbase/mailing_lists.html|HBase mailing lists]].
- 
- The code may be [[http://github.com/tomstrummer/HBaseBuilder/tree/|downloaded from Github]].
- 
- 
- === Examples ===
- 
- Creating or modifying a table:
- {{{#!java
- /* Create:  this will create a table if it does not exist, or disable
-    & update column families if the table already does exist.  The table 
-    will be enabled when the create statement returns */
- 
- def hbase = HBaseBuilder.connect()
- 
- hbase.create( 'myTable' ) {
-  family( 'familyOne' ) {
-    inMemory = true
-    bloomFilter = false
-  }
-  // create second family w/ the default options:
-  family 'familyTwo'
- }
- }}}
- 
- Inserting or updating rows:
- {{{#!java
- // Insert or update rows:
- hbase.update( 'myTable' ) {
-  row( 'rowOne' ) {
-    family( 'familyOne' ) {
-      col 'one', 'someValue'
-      col 'two', 'anotherValue'
-      col 'three', 1234
-      // note that doubles aren't supported as of HBase v0.18, but will be 'soon'
-    }
-    // alternate form that doesn't use nested family name:
-    col 'familyOne:four', 12345
-  }
-  row( 'rowTwo' ) { /* more column values */ }
-  // etc
-  // TODO - row method that accepts rowKey & map of column name/ values
- }
- }}}
- 
- Scanning: 
- {{{#!java
- hbase.tableName = 'myTable' // set a default table name
- 
- /* Scan a table, passing each RowResult to the given closure.  Since RowResult 
-    implements SortedMap, all of Groovy's Map operations are available here (like
-    each, [], etc.  But keep in mind the values are byte arrays if accessed in this
-    fashion.  So as a convenience, the RowResult has some methods added to it - 
-    getString, getInt, getLong, and getDate */
- 
- hbase.scan( cols : ['fam1:col1', 'fam2:*'],
-            // all other named params are optional:
-            start : '001', end : '200',
-            // any timestamp args may be long, Date or Calendar
-            timestamp : Date.parse( 'yy/MM/dd HH:mm:ss', '08/11/25 05:00:00' )
-            ) { row ->
- 
-   println "${row.key} : ${row.getString('fam1:col1')}" 
- }
- }}}
- 
- 
- 
- === A More Realistic Example ===
- 
- Say you wanted to do batch loading of data from CSV files and insert the data to HBase.
 The code could be written as a Groovy script that looks like this:
- 
- {{{#!java
- hbase.update( 'myTable' ) {
-   new File( 'someFile.csv' ).eachLine { line ->
-     def values = line.split(',')
-     row( values[0] ) {
-       col 'fam1:val1', values[1]
-       col 'fam1:val2', values[2]
-     }
-   }
- }
- }}}
- 
- === Changelog ===
-  * '''21/03/2009''' - Updated for HBase 0.19
-  * '''28/11/2008''' - Initial upload
- 
