hadoop-hive-dev mailing list archives

From "HBase Review Board (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1634) Allow access to Primitive types stored in binary format in HBase
Date Mon, 13 Sep 2010 21:15:46 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909003#action_12909003 ]

HBase Review Board commented on HIVE-1634:
------------------------------------------

Message from: bkm.hadoop@gmail.com

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/826/
-----------------------------------------------------------

Review request for Hive Developers and John Sichi.


Summary
-------

This addresses HIVE-1245 in part, for atomic or primitive types.

The serde property "hbase.columns.storage.types" = "-,b,b,b,b,b,b,b,b" is a specification
of the storage option for the corresponding column in the serde property "hbase.columns.mapping".
Allowed values are '-' for table default, 's' for standard string storage, and 'b' for binary
storage as would be obtained from o.a.h.hbase.util.Bytes. Map types for HBase column families
use a colon separated pair such as 's:b' for the key and value part specifiers respectively.
See the test cases and queries for HBase handler for additional examples.
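
To make the specification format concrete, here is a minimal Python sketch of how the two
serde properties line up position by position. It is not code from the patch; the function
name parse_storage_types and the column family "cf2:" in the usage note are hypothetical and
used only for illustration.

```python
from typing import List, Tuple

# Specifier characters described above: table default, string, binary.
VALID_SPECIFIERS = {"-", "s", "b"}


def parse_storage_types(mapping: str, storage_types: str) -> List[Tuple[str, Tuple[str, ...]]]:
    """Pair each entry of hbase.columns.mapping with its storage specifier.

    Hypothetical helper: a scalar column yields a 1-tuple such as ("b",),
    and a map column yields a 2-tuple such as ("s", "b") for key and value.
    """
    columns = mapping.split(",")
    specs = storage_types.split(",")
    if len(columns) != len(specs):
        raise ValueError("each mapped column needs exactly one storage specifier")

    result = []
    for column, spec in zip(columns, specs):
        parts = tuple(spec.split(":"))
        if len(parts) > 2 or any(p not in VALID_SPECIFIERS for p in parts):
            raise ValueError(f"invalid storage specifier {spec!r} for column {column!r}")
        result.append((column, parts))
    return result
```

For example, parse_storage_types(":key,cf:int,cf2:", "-,b,s:b") pairs the row key with the
table default, cf:int with binary storage, and the map over family cf2 with string keys and
binary values.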

There is also a table property "hbase.table.default.storage.type" = "string" to specify a
table level default storage type. The other valid specification is "binary". The table level
default is overridden by a column level specification.
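
The precedence rule can be sketched as follows. This is an illustration only, not the
patch's code: effective_storage is a hypothetical name, and the fallback to "string" when
the table property is absent is an assumption based on the example session in the issue
description.

```python
# Column-level specifier characters mapped to table-level type names.
SPECIFIER_NAMES = {"s": "string", "b": "binary"}


def effective_storage(column_spec: str, table_default: str = "string") -> str:
    """Resolve the storage type actually used for one column.

    Hypothetical helper: '-' defers to the table-level default, while an
    explicit 's' or 'b' overrides it.
    """
    if table_default not in ("string", "binary"):
        raise ValueError(f"invalid table default {table_default!r}")
    if column_spec == "-":
        return table_default
    if column_spec not in SPECIFIER_NAMES:
        raise ValueError(f"invalid column specifier {column_spec!r}")
    return SPECIFIER_NAMES[column_spec]
```

With a table default of "binary", a column marked 's' still resolves to "string", and a
column marked '-' resolves to "binary".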

This control is available for the boolean, tinyint, smallint, int, bigint, float, and double
primitive types. The attached patch also relaxes the mapping of map types to HBase column
families to allow any primitive type to be the map key.
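
For readers unfamiliar with the binary layout, the sketch below mimics it in Python with
the standard struct module. It assumes fixed-width big-endian encodings, which is how
HBase's Bytes utility writes these types, and it assumes a boolean is stored as a single
byte (0xFF for true, 0x00 for false). The names encode and decode are hypothetical and do
not appear in the patch.

```python
import struct

# Hive primitive type -> big-endian struct format (assumed to match Bytes).
FORMATS = {
    "tinyint": ">b",   # 1 byte
    "smallint": ">h",  # 2 bytes
    "int": ">i",       # 4 bytes
    "bigint": ">q",    # 8 bytes
    "float": ">f",     # 4 bytes, IEEE 754
    "double": ">d",    # 8 bytes, IEEE 754
}


def encode(hive_type: str, value) -> bytes:
    """Encode a value the way a binary-storage cell is assumed to hold it."""
    if hive_type == "boolean":
        # Assumption: one byte, 0xFF for true and 0x00 for false.
        return b"\xff" if value else b"\x00"
    return struct.pack(FORMATS[hive_type], value)


def decode(hive_type: str, raw: bytes):
    """Decode a binary-storage cell back into a Python value."""
    if hive_type == "boolean":
        return raw != b"\x00"
    return struct.unpack(FORMATS[hive_type], raw)[0]
```

A round trip such as decode("smallint", encode("smallint", -32768)) returns the original
value, and an int always occupies exactly four bytes regardless of its magnitude.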


This addresses bug HIVE-1634.
    http://issues.apache.org/jira/browse/HIVE-1634


Diffs
-----

  trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 990439
  trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseCellMap.java 990439
  trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseRow.java 990439
  trunk/hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseSerDe.java 990439
  trunk/hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestLazyHBaseObject.java 990439
  trunk/hbase-handler/src/test/queries/hbase_binary_map_queries.q PRE-CREATION
  trunk/hbase-handler/src/test/queries/hbase_binary_storage_queries.q PRE-CREATION
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyBooleanBinary.java PRE-CREATION
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyByteBinary.java PRE-CREATION
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyDoubleBinary.java PRE-CREATION
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java 990439
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFloatBinary.java PRE-CREATION
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyIntegerBinary.java PRE-CREATION
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyLongBinary.java PRE-CREATION
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyShortBinary.java PRE-CREATION
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyUtils.java 990439

Diff: http://review.cloudera.org/r/826/diff


Testing
-------

The HBase handler tests TestHBaseSerDe, TestLazyHBaseObject, TestHBaseCliDriver, and TestHBaseMinimrCliDriver
pass.

New tests have been added to TestHBaseSerDe and TestLazyHBaseObject to test this feature.

New queries which exercise this feature have been added to query files hbase_binary_map_queries.q
and hbase_binary_storage_queries.q.


Thanks,

bkm




> Allow access to Primitive types stored in binary format in HBase
> ----------------------------------------------------------------
>
>                 Key: HIVE-1634
>                 URL: https://issues.apache.org/jira/browse/HIVE-1634
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: HBase Handler
>    Affects Versions: 0.7.0
>            Reporter: Basab Maulik
>            Assignee: Basab Maulik
>         Attachments: HIVE-1634.0.patch, TestHiveHBaseExternalTable.java
>
>
> This addresses HIVE-1245 in part, for atomic or primitive types.
> The serde property "hbase.columns.storage.types" = "-,b,b,b,b,b,b,b,b" is a specification
> of the storage option for the corresponding column in the serde property "hbase.columns.mapping".
> Allowed values are '-' for table default, 's' for standard string storage, and 'b' for binary
> storage as would be obtained from o.a.h.hbase.util.Bytes. Map types for HBase column families
> use a colon separated pair such as 's:b' for the key and value part specifiers respectively.
> See the test cases and queries for HBase handler for additional examples.
> There is also a table property "hbase.table.default.storage.type" = "string" to specify
> a table level default storage type. The other valid specification is "binary". The table level
> default is overridden by a column level specification.
> This control is available for the boolean, tinyint, smallint, int, bigint, float, and
> double primitive types. The attached patch also relaxes the mapping of map types to HBase
> column families to allow any primitive type to be the map key.
> Attached is a program for creating a table and populating it in HBase. The external table
> in Hive can access the data as shown in the example below.
> hive> create external table TestHiveHBaseExternalTable
>     > (key string, c_bool boolean, c_byte tinyint, c_short smallint,
>     >  c_int int, c_long bigint, c_string string, c_float float, c_double double)
>     >  stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>     >  with serdeproperties ("hbase.columns.mapping" = ":key,cf:boolean,cf:byte,cf:short,cf:int,cf:long,cf:string,cf:float,cf:double")
>     >  tblproperties ("hbase.table.name" = "TestHiveHBaseExternalTable");
> OK
> Time taken: 0.691 seconds
> hive> select * from TestHiveHBaseExternalTable;
> OK
> key-1	NULL	NULL	NULL	NULL	NULL	Test-String	NULL	NULL
> Time taken: 0.346 seconds
> hive> drop table TestHiveHBaseExternalTable;
> OK
> Time taken: 0.139 seconds
> hive> create external table TestHiveHBaseExternalTable
>     > (key string, c_bool boolean, c_byte tinyint, c_short smallint,
>     >  c_int int, c_long bigint, c_string string, c_float float, c_double double)
>     >  stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>     >  with serdeproperties (
>     >  "hbase.columns.mapping" = ":key,cf:boolean,cf:byte,cf:short,cf:int,cf:long,cf:string,cf:float,cf:double",
>     >  "hbase.columns.storage.types" = "-,b,b,b,b,b,b,b,b" )
>     >  tblproperties (
>     >  "hbase.table.name" = "TestHiveHBaseExternalTable",
>     >  "hbase.table.default.storage.type" = "string");
> OK
> Time taken: 0.139 seconds
> hive> select * from TestHiveHBaseExternalTable;
> OK
> key-1	true	-128	-32768	-2147483648	-9223372036854775808	Test-String	-2.1793132E-11	2.01345E291
> Time taken: 0.151 seconds
> hive> drop table TestHiveHBaseExternalTable;
> OK
> Time taken: 0.154 seconds
> hive> create external table TestHiveHBaseExternalTable
>     > (key string, c_bool boolean, c_byte tinyint, c_short smallint,
>     >  c_int int, c_long bigint, c_string string, c_float float, c_double double)
>     >  stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>     >  with serdeproperties (
>     >  "hbase.columns.mapping" = ":key,cf:boolean,cf:byte,cf:short,cf:int,cf:long,cf:string,cf:float,cf:double",
>     >  "hbase.columns.storage.types" = "-,b,b,b,b,b,-,b,b" )
>     >  tblproperties ("hbase.table.name" = "TestHiveHBaseExternalTable");
> OK
> Time taken: 0.347 seconds
> hive> select * from TestHiveHBaseExternalTable;
> OK
> key-1	true	-128	-32768	-2147483648	-9223372036854775808	Test-String	-2.1793132E-11	2.01345E291
> Time taken: 0.245 seconds
> hive> 
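
The NULL values in the first query of the session above are what one would expect when
binary cells are read with string storage: the raw bytes are not a textual number, so
parsing fails. The Python sketch below only mimics that behaviour under the assumption
that an unparseable value becomes NULL; read_int_as_string is a hypothetical name and
this is not Hive's actual parsing code.

```python
import struct
from typing import Optional


def read_int_as_string(raw: bytes) -> Optional[int]:
    """Interpret cell bytes as the text of an integer, as string storage would.

    Returns None (standing in for NULL) when the bytes are not valid text
    or the text is not an integer.
    """
    try:
        return int(raw.decode("utf-8"))
    except (UnicodeDecodeError, ValueError):
        return None


# A cell written in binary form: four big-endian bytes (assumed layout).
binary_cell = struct.pack(">i", -2147483648)

# The same column read as a string fails to parse and comes back as None.
as_string = read_int_as_string(binary_cell)

# Reading it as binary recovers the value that was stored.
as_binary = struct.unpack(">i", binary_cell)[0]
```

Declaring the column with 'b' in "hbase.columns.storage.types", as the second and third
table definitions do, corresponds to the as_binary path.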

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

