hive-user mailing list archives

From Demai Ni <>
Subject load TPCH HBase tables through Hive
Date Mon, 02 Mar 2015 23:42:31 GMT
Hi folks,

I am using the HBase integration feature from Hive to load
TPCH tables into HBase, with Hive 0.13 and HBase 0.98.6.

The load works well. However, as documented here:

Key uniqueness prevents me from loading all 'lineitem' rows, because the
'lineitem' table uses "L_ORDERKEY, L_LINENUMBER" as its compound primary
key. If I map only 'L_ORDERKEY' to the HBase key (aka the row key), every
row that shares an order key with an earlier row overwrites it, so many
rows are lost.
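
To put a number on the overwrite problem, here is a rough sketch of a check against the source table. It assumes the source Hive table 'lineitem' names its order key column l_orderkey (the source table's column names are not shown in this message, so that name is an assumption). The difference between the two counts is the number of rows that would be lost when only the order key is used as the row key.

```sql
-- Assumption: the source table's order key column is named l_orderkey.
-- total_rows         = rows in the Hive source table
-- distinct_orderkeys = rows that survive in HBase when l_orderkey alone is the row key
SELECT COUNT(*)                   AS total_rows,
       COUNT(DISTINCT l_orderkey) AS distinct_orderkeys
FROM lineitem;
```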

Any suggestions? Someone on this list must have gone through this already. :-)

BTW, here is my Hive DDL.

create table hbase_lineitem(
  l_orderkey bigint, l_partkey bigint, l_suppkey int, l_linenumber bigint,
  l_quantity double, l_extendedprice double, l_discount double, l_tax double,
  l_returnflag string, l_linestatus string, l_shipdate string,
  l_commitdate string, l_receiptdate string, l_shipinstruct string,
  l_shipmode string, l_comment string )
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
-- no whitespace between mapping entries: it would become part of the column name
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,l_partkey:val,l_suppkey:val,l_linenumber:val,l_quantity:val,l_extendedprice:val,l_discount:val,l_tax:val,l_returnflag:val,l_linestatus:val,l_shipdate:val,l_commitdate:val,l_receiptdate:val,l_shipinstruct:val,l_shipmode:val,l_comment:val")
TBLPROPERTIES ("" = "lineitem");

insert overwrite table hbase_lineitem select * from lineitem;