Date: Tue, 8 Feb 2011 22:10:57 +0000 (UTC)
From: "He Yongqiang (JIRA)"
To: hive-dev@hadoop.apache.org
Reply-To: dev@hive.apache.org
Message-ID: <1692855289.3723.1297203057476.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1009852396.7962.1296758489413.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] Updated: (HIVE-1950) Block merge for RCFile

[
https://issues.apache.org/jira/browse/HIVE-1950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-1950:
-------------------------------

    Attachment: HIVE-1950.2.patch

A new patch that addresses the review comments. A few of them, including the stat update, will go into a follow-up.

> Block merge for RCFile
> ----------------------
>
>                 Key: HIVE-1950
>                 URL: https://issues.apache.org/jira/browse/HIVE-1950
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: HIVE-1950.1.patch, HIVE-1950.2.patch
>
>
> In our environment, there are a lot of small files inside one partition/table. To reduce the namenode load, we have a dedicated housekeeping job that merges these files. Right now the merge is an 'insert overwrite' in Hive, which requires decompressing the data and then recompressing it. This jira adds a command to Hive that does the merge without decompressing and recompressing the data.
> Something like "alter table tbl_name [partition ()] merge files". In this jira the new command will only support RCFile, since it needs some new APIs in the file format.
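
To illustrate the idea of merging without decompressing and recompressing, here is a minimal sketch in Python. It is not the patch and does not use the RCFile layout: it assumes a toy container format in which each block is a 4-byte big-endian length followed by zlib-compressed bytes, and every function name (write_blocks, iter_raw_blocks, merge_blocks, read_payloads) is hypothetical. The point it shows is only that a merge can copy compressed blocks verbatim from many small files into one larger file.

```python
import io
import struct
import zlib

# Toy block format (an assumption for illustration, NOT the RCFile layout):
#   [4-byte big-endian length][zlib-compressed payload] repeated.


def write_blocks(stream, payloads):
    """Write each payload as one compressed, length-prefixed block."""
    for payload in payloads:
        compressed = zlib.compress(payload)
        stream.write(struct.pack(">I", len(compressed)))
        stream.write(compressed)


def iter_raw_blocks(stream):
    """Yield the still-compressed bytes of every block in the stream."""
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of file
        if len(header) != 4:
            raise ValueError("truncated block header")
        (size,) = struct.unpack(">I", header)
        block = stream.read(size)
        if len(block) != size:
            raise ValueError("truncated block")
        yield block


def merge_blocks(sources, dest):
    """Copy blocks from all sources into dest without decompressing them.

    Returns the number of blocks copied.
    """
    count = 0
    for src in sources:
        for block in iter_raw_blocks(src):
            dest.write(struct.pack(">I", len(block)))
            dest.write(block)  # compressed bytes are copied as-is
            count += 1
    return count


def read_payloads(stream):
    """Decompress every block; used here only to check the merged output."""
    return [zlib.decompress(block) for block in iter_raw_blocks(stream)]
```

In this sketch only read_payloads calls zlib.decompress, and merge_blocks never does, so the merge cost is dominated by I/O rather than by the codec. A real file format would also have to deal with file headers, sync markers, and per-file metadata, which is presumably why the description mentions new APIs in the file format.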