Date: Thu, 20 Jun 2013 02:16:20 +0000 (UTC)
From: "Navis (JIRA)"
To: hive-dev@hadoop.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Created] (HIVE-4765) Improve HBase bulk loading facility

Navis created HIVE-4765:
---------------------------

             Summary: Improve HBase bulk loading facility
                 Key: HIVE-4765
                 URL: https://issues.apache.org/jira/browse/HIVE-4765
             Project: Hive
          Issue Type: Improvement
          Components: HBase Handler
            Reporter: Navis
            Assignee: Navis
            Priority: Minor

With some patches, the bulk loading process for HBase could be simplified considerably.
{noformat}
CREATE EXTERNAL TABLE hbase_export(rowkey STRING, col1 STRING, col2 STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseExportSerDe'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:key,cf2:value")
STORED AS
  INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.hbase.HiveHFileExporter'
LOCATION '/tmp/export';

SET mapred.reduce.tasks=4;
set hive.optimize.sampling.orderby=true;

INSERT OVERWRITE TABLE hbase_export
SELECT * from (SELECT union_kv(key,key,value,":key,cf1:key,cf2:value") as (rowkey,union) FROM src) A
ORDER BY rowkey,union;

hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles /tmp/export test
{noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira