X-Original-To: apmail-hive-dev-archive@www.apache.org
Delivered-To: apmail-hive-dev-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
	by minotaur.apache.org (Postfix) with SMTP id BE799DC90;
	Tue, 17 Jul 2012 07:53:41 +0000 (UTC)
Received: (qmail 64946 invoked by uid 500); 17 Jul 2012 07:53:38 -0000
Delivered-To: apmail-hive-dev-archive@hive.apache.org
Received: (qmail 64401 invoked by uid 500); 17 Jul 2012 07:53:38 -0000
Mailing-List: contact dev-help@hive.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@hive.apache.org
Delivered-To: mailing list dev@hive.apache.org
Received: (qmail 64308 invoked by uid 500); 17 Jul 2012 07:53:37 -0000
Delivered-To: apmail-hadoop-hive-dev@hadoop.apache.org
Received: (qmail 64234 invoked by uid 99); 17 Jul 2012 07:53:36 -0000
Received: from issues-vm.apache.org (HELO issues-vm) (140.211.11.160)
	by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 17 Jul 2012 07:53:36 +0000
Received: from isssues-vm.apache.org (localhost [127.0.0.1])
	by issues-vm (Postfix) with ESMTP id 4697E140B94;
	Tue, 17 Jul 2012 07:53:36 +0000 (UTC)
Date: Tue, 17 Jul 2012 07:53:36 +0000 (UTC)
From: "Weidong Bian (JIRA)"
To: hive-dev@hadoop.apache.org
Message-ID: <2109342271.62723.1342511616291.JavaMail.jiratomcat@issues-vm>
In-Reply-To: <894484748.34884.1313188291070.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HIVE-2373) Importing hive tables into
	hbase+hive requires a lot of work which often can be implied
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394


    [ https://issues.apache.org/jira/browse/HIVE-2373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415990#comment-13415990 ]

Weidong Bian commented on HIVE-2373:
------------------------------------

I've also encountered this issue and got a quick
and dirty fix for this. The attached preliminary patch supplies a hard-coded default mapping if WITH SERDEPROPERTIES ("hbase.columns.mapping") is missing. It uses the first column specified by the user as :key and "cf" as the column family name, so it only works if all columns are mapped to one column family.

A better approach would be to allow the user to specify something like WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key@2") to designate the second column as the :key and have the rest added automatically. If anyone is interested, I can work on this.

> Importing hive tables into hbase+hive requires a lot of work which often can be implied
> ---------------------------------------------------------------------------------------
>
>                 Key: HIVE-2373
>                 URL: https://issues.apache.org/jira/browse/HIVE-2373
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Alex Newman
>            Priority: Minor
>
> The HiveQL way of creating a HBase table looks something like
> CREATE TABLE bla(id_1 type_1, id_2 type_2..., id_n type_n)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:id_2, cf:id_3") TBLPROPERTIES ("hbase.table.name" = "blah");
> But in most cases huge amounts of this can be assumed from the original table description. In fact in most cases, especially ones when that data was imported from MySQL it is trivial to generate at least one HBase backing for that data. I currently wrote a python script which our users can use to make things simpler. Would anyone be interested in that script? Would it make sense to make it easy from Hive? I hate to add reserved words so any suggestions are welcome.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
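
For illustration only, the default mapping described in the comment above (one column as :key, every other column placed in a single "cf" family) can be sketched as a small Python helper. This is not the attached patch and not the reporter's script; the function names and the key_position parameter are hypothetical, and key_position merely mimics the proposed ":key@2" idea, which is not existing Hive syntax.

```python
def default_columns_mapping(columns, key_position=1, family="cf"):
    """Build an hbase.columns.mapping value from Hive column names.

    columns:      column names in declaration order
    key_position: 1-based index of the column used as the HBase row key
                  (hypothetical; mirrors the proposed ":key@2" shorthand)
    family:       the single column family assumed for all other columns
    """
    if not columns:
        raise ValueError("at least one column is required")
    if not 1 <= key_position <= len(columns):
        raise ValueError("key_position is out of range")
    parts = []
    for position, name in enumerate(columns, start=1):
        # The key column maps to :key; everything else goes to family:name.
        parts.append(":key" if position == key_position else f"{family}:{name}")
    return ",".join(parts)


def create_table_ddl(table, columns, hbase_table, key_position=1, family="cf"):
    """Render a CREATE TABLE statement; columns is a list of (name, type) pairs."""
    column_list = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    mapping = default_columns_mapping(
        [name for name, _ in columns], key_position, family
    )
    return (
        f"CREATE TABLE {table}({column_list})\n"
        "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'\n"
        f'WITH SERDEPROPERTIES ("hbase.columns.mapping" = "{mapping}")\n'
        f'TBLPROPERTIES ("hbase.table.name" = "{hbase_table}");'
    )


if __name__ == "__main__":
    print(create_table_ddl(
        "bla",
        [("id_1", "string"), ("id_2", "int"), ("id_3", "string")],
        "blah",
    ))
```

Like the quick fix it illustrates, this sketch only covers tables whose non-key columns all live in one column family; anything else would still need an explicit mapping.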