hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Trivial Update of "Hive/HBaseIntegration" by CarlSteinbach
Date Tue, 08 Jun 2010 21:43:40 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/HBaseIntegration" page has been changed by CarlSteinbach.
http://wiki.apache.org/hadoop/Hive/HBaseIntegration?action=diff&rev1=29&rev2=30

--------------------------------------------------

+ = Hive HBase Integration =
+ 
+ <<TableOfContents>>
+ 
- = Introduction =
+ == Introduction ==
  
  This page documents the Hive/HBase integration support originally
  introduced in
@@ -15, +19 @@

  This feature is a work in progress, and suggestions for its
  improvement are very welcome.
  
- = Storage Handlers =
+ == Storage Handlers ==
  
  Before proceeding, please read [[Hive/StorageHandlers]] for an overview
  of the generic storage handler framework on which HBase integration depends.
  
- = Usage =
+ == Usage ==
  
  The storage handler is built as an independent module,
  {{{hive_hbase_handler.jar}}}, which must be available on the Hive
@@ -132, +136 @@

  validated against the existing HBase table's column families), whereas
  {{{hbase.table.name}}} is optional.
  
- = Column Mapping =
+ == Column Mapping ==
  
  The column mapping support currently available is somewhat
  cumbersome and restrictive:
@@ -148, +152 @@

  
  The next few sections provide detailed examples of the kinds of column mappings currently possible.
  
- == Multiple Columns and Families ==
+ === Multiple Columns and Families ===
  
  Here's an example with three Hive columns and two HBase column
  families, with two of the Hive columns ({{{value1}}} and {{{value2}}})
@@ -202, +206 @@

  Time taken: 4.054 seconds
  }}}
  
- == Hive MAP to HBase Column Family ==
+ === Hive MAP to HBase Column Family ===
  
  Here's how a Hive MAP datatype can be used to access an entire column
  family.  Each row can have a different set of columns, where the
@@ -256, +260 @@

  FAILED: Error in metadata: java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException org.apache.hadoop.hive.hbase.HBaseSerDe: hbase column family 'cf:' should be mapped to map<string,?> but is mapped to map<int,int>)
  }}}
  
- == Illegal:  Hive Primitive to HBase Column Family ==
+ === Illegal:  Hive Primitive to HBase Column Family ===
  
  Table definitions such as the following are illegal because a
  Hive column mapped to an entire column family must have MAP type:
@@ -271, +275 @@

  }}}
  
  
- = Key Uniqueness =
+ == Key Uniqueness ==
  
  One subtle difference between HBase tables and Hive tables is that HBase tables have a unique key, whereas Hive tables do not.  When multiple rows with the same key are inserted into HBase, only one of them is stored (the choice is arbitrary, so do not rely on HBase to pick the right one).  This is in contrast to Hive, which is happy to store multiple rows with the same key and different values.
  
@@ -299, +303 @@

  SELECT COUNT(1) FROM pokes3 WHERE foo=498;
  }}}
  
- = Potential Followups =
+ == Potential Followups ==
  
  There are a number of areas where Hive/HBase integration could definitely use more love:
  
@@ -314, +318 @@

   * replace dependencies on deprecated HBase APIs such as RowResult (HIVE-1229)
   * allow HBase WAL to be disabled (HIVE-1383)
  
- = Build =
+ == Build ==
  
  Code for the storage handler is located under
  {{{hive/trunk/hbase-handler}}}.  The Hive build automatically enables
@@ -327, +331 @@

  {{{hbase-handler/lib}}}.  We will convert this to use Ivy instead once
  the corresponding POMs are available.
  
- = Tests =
+ == Tests ==
  
  Class-level unit tests are provided under
  {{{hbase-handler/src/test/org/apache/hadoop/hive/hbase}}}.
@@ -344, +348 @@

  
  An Eclipse launch template remains to be defined.
  
- = Links =
+ == Links ==
  
   * For information on how to bulk load data from Hive into HBase, see [[Hive/HBaseBulkLoad]].
   * For another project which adds SQL-like query language support on top of HBase, see [[http://www.hbql.com|HBQL]] (unrelated to Hive).
  
- = Acknowledgements =
+ == Acknowledgements ==
  
   * Primary credit for this feature goes to Samuel Guo, who did most of the development work in the early drafts of the patch
  
