hive-user mailing list archives

From Madhusudhana Rao Podila
Subject Problem with Hive/HBase integration
Date Fri, 27 Jan 2012 05:37:36 GMT

I have a problem creating a Hive table from an existing HBase table (as an external table)
when mapping multiple columns from a column family individually (not as a Map).

Case 1:
I created a table in HBase and was able to map it to Hive as an external table using only
one column from the column family.

I created the table in HBase using the following commands:

hbase(main):001:0> create 'hbasetohive', 'colfamily'

0 row(s) in 1.9700 seconds

hbase(main):002:0> put 'hbasetohive', '1s', 'colfamily:val','1strowval'

0 row(s) in 0.2240 seconds

hbase(main):003:0> scan 'hbasetohive'

ROW                    COLUMN+CELL

 1s                    column=colfamily:val, timestamp=1327676987075, value=1strowva


1 row(s) in 0.0840 seconds


hive> CREATE EXTERNAL TABLE hbase_hivetable_k(key string, value string)

    > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'

    > WITH SERDEPROPERTIES("hbase.columns.mapping" = "colfamily:val")

    > TBLPROPERTIES("" = "hbasetohive");


Time taken: 10.808 seconds

hive> select * from hbase_hivetable_k;


1s      1strowval

Time taken: 1.314 seconds

Case 2:

I created a table in HBase with column family cf_cdr and two columns, caller_name and
caller_number. I then tried to create the Hive table from this HBase table by specifying
both columns from the column family, but it throws a MetaException. If I restrict the
mapping to only one column, I am able to create the table in Hive properly.


hbase(main):004:0> create 'hb_cdr', 'cf_cdr'

0 row(s) in 1.4870 seconds

hbase(main):005:0> put 'hb_cdr', 'cdr_r1', 'cf_cdr:caller_name', 'madhu'

0 row(s) in 0.0490 seconds

hbase(main):006:0> put 'hb_cdr', 'cdr_r1', 'cf_cdr:caller_number', '08877232010'

0 row(s) in 0.0300 seconds

hbase(main):007:0> put 'hb_cdr', 'cdr_r2', 'cf_cdr:caller_name', 'bharat'

0 row(s) in 0.0170 seconds

hbase(main):008:0> scan 'hb_cdr'

ROW                    COLUMN+CELL

 cdr_r1                column=cf_cdr:caller_name, timestamp=1327677898993, value=mad


 cdr_r1                column=cf_cdr:caller_number, timestamp=1327677912648, value=0


 cdr_r2                column=cf_cdr:caller_name, timestamp=1327677919720, value=bha


2 row(s) in 0.1020 seconds


hive> CREATE EXTERNAL TABLE hv_hb_cdr(key string, c_name string, c_number string)

    > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'

    > WITH SERDEPROPERTIES("hbase.columns.mapping" = "cf_cdr:caller_name, cf_cdr:caller_number")

    > TBLPROPERTIES("" = "hb_cdr");

FAILED: Error in metadata: MetaException(message:Column Family  cf_cdr is not defined in hbase table hb_cdr)

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
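
The error message above prints two spaces before cf_cdr, which may mean that the space after the comma in the "hbase.columns.mapping" value is being kept as part of the second column family name. The following is only a minimal Python sketch of that guess, assuming the mapping string is split on commas without trimming whitespace. The function split_mapping and the set defined_families are hypothetical names for illustration; this is not the actual Hive parsing code and has not been checked against it.

```python
# Hypothetical illustration only; this is NOT the Hive HBase handler's code.
# Assumption: the mapping string is split on "," and entries are not trimmed.

def split_mapping(mapping):
    """Split a columns-mapping string into (family, qualifier) pairs, untrimmed."""
    pairs = []
    for entry in mapping.split(","):
        family, _, qualifier = entry.partition(":")
        pairs.append((family, qualifier))
    return pairs

# Mapping exactly as written in Case 2 (space after the comma).
with_space = split_mapping("cf_cdr:caller_name, cf_cdr:caller_number")
# Same mapping with the space removed.
without_space = split_mapping("cf_cdr:caller_name,cf_cdr:caller_number")

# The only column family created in HBase for table hb_cdr.
defined_families = {"cf_cdr"}

# Families named in the mapping that do not match a defined family exactly.
missing = [fam for fam, _ in with_space if fam not in defined_families]
print(missing)  # prints [' cf_cdr'] (note the leading space)
```

If this assumption holds, writing the mapping without the space after the comma would be the first thing to try.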

Is there any issue in the above script?

Please suggest.

Madhusudhana Rao. Podila
