From: "Zheng Shao (JIRA)"
To: hive-dev@hadoop.apache.org
Reply-To: hive-dev@hadoop.apache.org
Date: Tue, 18 Aug 2009 10:38:14 -0700 (PDT)
Subject: [jira] Resolved: (HIVE-759) add hive.intermediate.compression.codec option
Message-ID: <1182002601.1250617094817.JavaMail.jira@brutus>
In-Reply-To: <1894859766.1250382194787.JavaMail.jira@brutus>

     [ https://issues.apache.org/jira/browse/HIVE-759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao resolved HIVE-759.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.5.0
     Release Note: HIVE-759. Add "hive.intermediate.compression.codec/type" option. (Yongqiang He via zshao)
     Hadoop Flags: [Reviewed]

Committed. Thanks Yongqiang.

> add hive.intermediate.compression.codec option
> ----------------------------------------------
>
>                 Key: HIVE-759
>                 URL: https://issues.apache.org/jira/browse/HIVE-759
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Zheng Shao
>            Assignee: He Yongqiang
>             Fix For: 0.5.0
>
>         Attachments: hive-759-2009-08-17.patch, hive-759-2009-08-18-2.patch, hive-759-2009-08-18.patch
>
>
> Hive uses the jobconf compression codec for all map-reduce jobs. This includes both mapred.map.output.compression.codec and mapred.output.compression.codec.
> In some cases, we want to distinguish between the codec used for intermediate map-reduce jobs (which produce intermediate data between jobs) and the codec used for final map-reduce jobs (which produce data stored in tables).
> For intermediate data, lzo might be a better fit because it is much faster; for final data, gzip might be a better fit because it saves disk space.
> We should introduce two new options:
> {code}
> hive.intermediate.compression.codec=org.apache.hadoop.io.compress.LzoCodec
> hive.intermediate.compression.type=BLOCK
> {code}
> And use these 2 options to override the mapred.output.compression.* settings in the FileSinkOperator that produces intermediate data.
> Note that a single map-reduce job may have 2 FileSinkOperators: one that produces intermediate data and one that produces final data. So we need to add a flag to fileSinkDesc to tell them apart.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
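
For readers of the archive, the override described in the issue can be sketched roughly as follows. This is only an illustration of the idea, not the code from the attached patches: the class name, the resolve() helper, and the producesIntermediateData flag are hypothetical, a plain Map stands in for the Hadoop JobConf, and falling back to the mapred.output.compression.* values when the intermediate options are unset is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

public class IntermediateCompressionSketch {

    /** Codec class name and compression type chosen for one file sink. */
    static final class Compression {
        final String codec;
        final String type;

        Compression(String codec, String type) {
            this.codec = codec;
            this.type = type;
        }
    }

    /**
     * Picks the compression settings for a single file sink.
     * The boolean plays the role of the per-sink flag the issue proposes
     * adding to fileSinkDesc (hypothetical name).
     */
    static Compression resolve(Map<String, String> conf, boolean producesIntermediateData) {
        // Start from the job-wide output settings.
        String codec = conf.get("mapred.output.compression.codec");
        String type = conf.get("mapred.output.compression.type");

        if (producesIntermediateData) {
            // Override only for sinks that write data consumed by a later job.
            // Assumption: an unset or empty option leaves the job-wide value in place.
            String intermediateCodec = conf.get("hive.intermediate.compression.codec");
            if (intermediateCodec != null && !intermediateCodec.isEmpty()) {
                codec = intermediateCodec;
            }
            String intermediateType = conf.get("hive.intermediate.compression.type");
            if (intermediateType != null && !intermediateType.isEmpty()) {
                type = intermediateType;
            }
        }
        return new Compression(codec, type);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("mapred.output.compression.codec", "org.apache.hadoop.io.compress.GzipCodec");
        conf.put("mapred.output.compression.type", "RECORD");
        conf.put("hive.intermediate.compression.codec", "org.apache.hadoop.io.compress.LzoCodec");
        conf.put("hive.intermediate.compression.type", "BLOCK");

        // One job may contain both kinds of sink, so the choice is made per sink.
        Compression intermediate = resolve(conf, true);
        Compression finalOutput = resolve(conf, false);

        System.out.println("intermediate: " + intermediate.codec + " / " + intermediate.type);
        System.out.println("final:        " + finalOutput.codec + " / " + finalOutput.type);
    }
}
```

Making the decision per sink rather than per job is the point of the flag: the same job configuration can then yield lzo for the intermediate sink and gzip for the sink that writes table data.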