hive-dev mailing list archives

From bkm.had...@gmail.com
Subject Re: Review Request: HIVE-1634 - Allow access to Primitive types stored in binary format in HBase
Date Fri, 22 Oct 2010 03:11:06 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/826/
-----------------------------------------------------------

(Updated 2010-10-21 20:11:06.837430)


Review request for Hive Developers and John Sichi.


Changes
-------

The proposed serde property "hbase.columns.storage.types" = "-,b,b,b,b,b,b,b,b", which specified
the storage option for the corresponding column in the serde property "hbase.columns.mapping",
has been removed. Instead, the storage option is now an optional part of "hbase.columns.mapping"
and is specified per column using '#' as a separator following the column family/qualifier.
Allowed values are '' for the table default, any prefix of 'string' for standard string storage,
and any prefix of 'binary' for binary storage as would be obtained from o.a.h.hbase.util.Bytes.
Map types for HBase column families use a colon-separated pair such as 'str:bin' or 's:b'
for the key and value part specifiers respectively.
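
As a rough illustration of the '#' suffix rules described above, here is a minimal
Python sketch. It is not taken from the patch (the real parsing lives in the Java
HBaseSerDe); the function names, the returned tuple shape, and the sample mapping
string are hypothetical and chosen only for this example.

```python
def resolve_storage(spec, default="string"):
    """Map a storage specifier to 'string' or 'binary'.

    An empty specifier means "use the table default"; otherwise the
    specifier must be a prefix of 'string' or 'binary' (e.g. 's', 'bin').
    """
    if spec == "":
        return default
    for name in ("string", "binary"):
        if name.startswith(spec):
            return name
    raise ValueError("invalid storage specifier: %r" % spec)


def parse_mapping_entry(entry, default="string"):
    """Split one mapping entry into (column, storage).

    storage is a single type for a plain column, or a (key, value) pair
    when the specifier is a colon-separated pair for a map-typed family.
    """
    # Everything before '#' is the family/qualifier, everything after it
    # is the optional storage specifier.
    column, _, spec = entry.partition("#")
    if ":" in spec:
        key_spec, _, value_spec = spec.partition(":")
        return column, (resolve_storage(key_spec, default),
                        resolve_storage(value_spec, default))
    return column, resolve_storage(spec, default)


def parse_mapping(mapping, default="string"):
    """Parse a comma-separated "hbase.columns.mapping" value."""
    return [parse_mapping_entry(e.strip(), default) for e in mapping.split(",")]


# Hypothetical mapping string, for illustration only.
parsed = parse_mapping(":key,cf:a#b,cf:b#str,cfm:#s:b")
```

With these assumptions, "cf:a#b" resolves to binary storage, "cf:b#str" to string
storage, and "cfm:#s:b" to a map whose keys are stored as strings and whose values
are stored in binary form.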

The tests TestHBaseSerDe, TestLazyHBaseObject, TestHBaseCliDriver, and TestHBaseMinimrCliDriver
pass.


Summary
-------

This addresses HIVE-1245 in part, for atomic or primitive types.

The serde property "hbase.columns.storage.types" = "-,b,b,b,b,b,b,b,b" is a specification
of the storage option for the corresponding column in the serde property "hbase.columns.mapping".
Allowed values are '' for table default, 's' for standard string storage, and 'b' for binary
storage as would be obtained from o.a.h.hbase.util.Bytes. Map types for HBase column families
use a colon-separated pair such as 's:b' for the key and value part specifiers respectively.
See the test cases and queries for HBase handler for additional examples.

There is also a table property "hbase.table.default.storage.type" = "string" to specify a
table level default storage type. The other valid specification is "binary". The table level
default is overridden by a column level specification.
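
The precedence between the table-level default and a column-level specification can
be sketched as follows. This is an illustrative Python sketch, not code from the
patch; the function name is hypothetical, and falling back to "string" when the
table property is absent is an assumption made for the example.

```python
VALID_STORAGE_TYPES = ("string", "binary")


def effective_storage_type(table_properties, column_level=None):
    """Return the storage type that applies to one column.

    column_level is 'string', 'binary', or None when the column carries
    no storage specification of its own.
    """
    # Assumption: use 'string' when the table property is not set.
    table_default = table_properties.get(
        "hbase.table.default.storage.type", "string")
    if table_default not in VALID_STORAGE_TYPES:
        raise ValueError("invalid table default: %r" % table_default)
    if column_level is None:
        return table_default
    if column_level not in VALID_STORAGE_TYPES:
        raise ValueError("invalid column storage type: %r" % column_level)
    # A column-level specification overrides the table-level default.
    return column_level


props = {"hbase.table.default.storage.type": "binary"}
inherited = effective_storage_type(props)             # column has no spec
overridden = effective_storage_type(props, "string")  # column overrides
```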

This control is available for the boolean, tinyint, smallint, int, bigint, float, and double
primitive types. The attached patch also relaxes the mapping of map types to HBase column
families to allow any primitive type to be the map key.
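
To make the difference between the two storage options concrete, the following
Python sketch contrasts the bytes produced for the same value. It only imitates the
fixed-width big-endian layout that org.apache.hadoop.hbase.util.Bytes produces for
numeric values; the function names and the type table are hypothetical, and boolean
and tinyint are left out of the sketch for brevity.

```python
import struct

# Hive type -> struct format. '>' selects big-endian byte order, which is
# the order HBase's Bytes utility uses for numeric values.
BINARY_FORMATS = {
    "smallint": ">h",  # 2 bytes
    "int": ">i",       # 4 bytes
    "bigint": ">q",    # 8 bytes
    "float": ">f",     # 4 bytes, IEEE 754
    "double": ">d",    # 8 bytes, IEEE 754
}


def encode(value, hive_type, storage):
    """Encode a value as it would look under 'string' or 'binary' storage."""
    if storage == "string":
        # String storage keeps the textual representation of the value.
        return str(value).encode("utf-8")
    return struct.pack(BINARY_FORMATS[hive_type], value)


def decode(raw, hive_type, storage):
    """Inverse of encode for the types in BINARY_FORMATS."""
    if storage == "string":
        text = raw.decode("utf-8")
        return float(text) if hive_type in ("float", "double") else int(text)
    return struct.unpack(BINARY_FORMATS[hive_type], raw)[0]


as_string = encode(42, "int", "string")  # the two ASCII characters '4', '2'
as_binary = encode(42, "int", "binary")  # four bytes, most significant first
```

A cell written by another application with Bytes.toBytes(42) would therefore not be
readable under string storage, which is the situation the binary option addresses.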


This addresses bug HIVE-1634.
    http://issues.apache.org/jira/browse/HIVE-1634


Diffs (updated)
-----

  trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 1023967
  trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStatsAggregator.java 1023967
  trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStatsPublisher.java 1023967
  trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java 1023967
  trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java 1023967
  trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseCellMap.java 1023967
  trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseRow.java 1023967
  trunk/hbase-handler/src/test/org/apache/hadoop/hive/hbase/HBaseTestSetup.java 1023967
  trunk/hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseSerDe.java 1023967
  trunk/hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestLazyHBaseObject.java 1023967
  trunk/hbase-handler/src/test/queries/hbase_binary_external_table_queries.q PRE-CREATION
  trunk/hbase-handler/src/test/queries/hbase_binary_map_queries.q PRE-CREATION
  trunk/hbase-handler/src/test/queries/hbase_binary_storage_queries.q PRE-CREATION
  trunk/hbase-handler/src/test/results/hbase_binary_external_table_queries.q.out PRE-CREATION
  trunk/hbase-handler/src/test/results/hbase_binary_map_queries.q.out PRE-CREATION
  trunk/hbase-handler/src/test/results/hbase_binary_storage_queries.q.out PRE-CREATION
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyBooleanBinary.java PRE-CREATION
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyByteBinary.java PRE-CREATION
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyDoubleBinary.java PRE-CREATION
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java 1023967
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFloatBinary.java PRE-CREATION
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyIntegerBinary.java PRE-CREATION
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyLongBinary.java PRE-CREATION
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyShortBinary.java PRE-CREATION
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyUtils.java 1023967

Diff: http://review.cloudera.org/r/826/diff


Testing
-------

The HBase handler tests TestHBaseSerDe, TestLazyHBaseObject, TestHBaseCliDriver, and TestHBaseMinimrCliDriver
pass.

New tests have been added to TestHBaseSerDe and TestLazyHBaseObject to test this feature.

New queries which exercise this feature have been added to query files hbase_binary_map_queries.q
and hbase_binary_storage_queries.q.


Thanks,

bkm

