hive-dev mailing list archives

From "Weidong Bian (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-2373) Importing hive tables into hbase+hive requires a lot of work which often can be implied
Date Tue, 17 Jul 2012 07:53:36 GMT

    [ https://issues.apache.org/jira/browse/HIVE-2373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415990#comment-13415990 ]

Weidong Bian commented on HIVE-2373:
------------------------------------

I've also run into this issue and have a quick-and-dirty fix for it.
The attached preliminary patch supplies a hard-coded default mapping when
WITH SERDEPROPERTIES ("hbase.columns.mapping") is missing.
It uses the first column specified by the user as :key and "cf" as the column family name,
so it only works if all columns are mapped to a single column family.
A better approach would be to let the user write something like
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key@2") to designate the second column
as the :key and have the remaining columns added automatically. If anyone is interested,
I can work on this.
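
To make the proposal concrete, here is a rough Python sketch of how such a default mapping could be derived from the column names. It is not the attached patch (which lives in Hive's Java code); the function name and parameters are made up for illustration, and the ":key@N" form is only the syntax suggested above, not something Hive supports today.

```python
def default_columns_mapping(columns, key_spec=":key@1", family="cf"):
    """Build an hbase.columns.mapping string from Hive column names.

    key_spec uses the hypothetical ":key@N" form proposed in this comment,
    where N is the 1-based position of the column to use as the row key.
    A bare ":key" means the first column.
    """
    prefix, _, position = key_spec.partition("@")
    if prefix != ":key":
        raise ValueError(f"unsupported key spec: {key_spec!r}")
    key_index = int(position) - 1 if position else 0
    if not 0 <= key_index < len(columns):
        raise ValueError(f"key position out of range: {key_spec!r}")
    # One mapping entry per Hive column, in declaration order; every
    # non-key column is placed in the same (assumed) column family.
    return ",".join(
        ":key" if i == key_index else f"{family}:{name}"
        for i, name in enumerate(columns)
    )


if __name__ == "__main__":
    cols = ["id_1", "id_2", "id_3"]
    print(default_columns_mapping(cols))            # first column as key
    print(default_columns_mapping(cols, ":key@2"))  # second column as key
```

As with the patch, this only covers the case where every non-key column goes into one column family.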
                
> Importing hive tables into hbase+hive requires a lot of work which often can be implied
> ---------------------------------------------------------------------------------------
>
>                 Key: HIVE-2373
>                 URL: https://issues.apache.org/jira/browse/HIVE-2373
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Alex Newman
>            Priority: Minor
>
> The HiveQL way of creating a HBase table looks something like 
> CREATE TABLE bla(id_1 type_1, id_2 type_2..., id_n type_n)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:id_2, cf:id_3")
> TBLPROPERTIES ("hbase.table.name" = "blah");
> But in most cases, much of this can be inferred from the original table description.
> In fact, in most cases, especially when the data was imported from MySQL, it is trivial
> to generate at least one HBase backing for that data. I have written a Python script that
> our users can use to make things simpler. Would anyone be interested in that script? Would
> it make sense to make this easy from within Hive? I hate to add reserved words, so any
> suggestions are welcome.
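
For illustration only, a helper along these lines could generate the statement from the table description. This is not the reporter's script, which is not included in the issue; the function name and arguments are hypothetical, and it assumes the first column is the row key and that all other columns share a single column family named "cf".

```python
def hbase_create_table(table, columns, hbase_table=None, family="cf"):
    """Return HiveQL for an HBase-backed table.

    columns is a list of (name, hive_type) pairs in declaration order.
    """
    if not columns:
        raise ValueError("at least one column is required")
    col_defs = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    # Assumption: first column is the row key, the rest share one family.
    mapping = ",".join(
        [":key"] + [f"{family}:{name}" for name, _ in columns[1:]]
    )
    return (
        f"CREATE TABLE {table}({col_defs})\n"
        "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'\n"
        f'WITH SERDEPROPERTIES ("hbase.columns.mapping" = "{mapping}")\n'
        f'TBLPROPERTIES ("hbase.table.name" = "{hbase_table or table}");'
    )


if __name__ == "__main__":
    print(hbase_create_table(
        "bla",
        [("id_1", "string"), ("id_2", "int"), ("id_3", "string")],
        hbase_table="blah",
    ))
```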

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
