Date: Tue, 15 Mar 2011 16:29:29 +0000 (UTC)
From: "Amareshwari Sriramadasu (JIRA)"
To: hive-dev@hadoop.apache.org
Reply-To: dev@hive.apache.org
Message-ID: <493249320.3990.1300206569789.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <851514033.3984.1300206449788.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] Commented: (HIVE-2056) Generate single MR job for multi groupby query.

[ https://issues.apache.org/jira/browse/HIVE-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006995#comment-13006995 ]

Amareshwari Sriramadasu commented on HIVE-2056:
-----------------------------------------------

Here is a request from one of our customers:

Here is a real example of the need for a multi group by in one M/R job. In the query below, two aggregates are generated from a single fact table. The first aggregate generates a unique count by date, and the second generates a unique count by date and gender. We have a lot of these aggregates to build. We would like this to run as one M/R job rather than the three shown below. Is it possible to do this in Hive?

// created two intermediate tables
hive> create table test_1 (dt string, bc_cnt bigint);
OK
Time taken: 9.004 seconds
hive> create table test_2 (dt string, gender string, bc_cnt bigint);
OK

// multi group by in insert statement
hive> from fact_table f
    > insert overwrite table test_1 select dt, count(distinct id) group by dt
    > insert overwrite table test_2 select dt,gender,count(distinct id) group by dt,gender;
Total MapReduce jobs = 3
Launching Job 1 out of 3
Number of reduce tasks not specified. Estimated from input data size: 999
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=
In order to set a constant number of reducers:
  set mapred.reduce.tasks=

Thanks
Sudhish

> Generate single MR job for multi groupby query.
> -----------------------------------------------
>
>                 Key: HIVE-2056
>                 URL: https://issues.apache.org/jira/browse/HIVE-2056
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
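
For illustration only, the two aggregates requested in the comment (distinct id count by dt, and distinct id count by dt and gender) can in principle be produced from one scan of the fact table. The Python sketch below shows that idea in memory. It is not how Hive plans or executes the query, and the function name and sample rows are hypothetical; only the column names dt, gender and id and the table names test_1 and test_2 come from the query above.

```python
from collections import defaultdict


def multi_group_by_distinct(rows):
    """Compute both distinct-count aggregates in a single pass.

    rows: iterable of (dt, gender, id) tuples standing in for fact_table.
    Returns (test_1, test_2), where
      test_1 maps dt -> count(distinct id)
      test_2 maps (dt, gender) -> count(distinct id)
    """
    ids_by_dt = defaultdict(set)          # grouping key: dt
    ids_by_dt_gender = defaultdict(set)   # grouping key: (dt, gender)

    # One scan of the input feeds both groupings.
    for dt, gender, id_ in rows:
        ids_by_dt[dt].add(id_)
        ids_by_dt_gender[(dt, gender)].add(id_)

    test_1 = {dt: len(ids) for dt, ids in ids_by_dt.items()}
    test_2 = {key: len(ids) for key, ids in ids_by_dt_gender.items()}
    return test_1, test_2


# Hypothetical sample rows: (dt, gender, id)
rows = [
    ("2011-03-14", "F", 1),
    ("2011-03-14", "F", 1),  # duplicate id, counted once
    ("2011-03-14", "M", 2),
    ("2011-03-15", "M", 2),
]
test_1, test_2 = multi_group_by_distinct(rows)
```

The sketch keeps every distinct id in memory, which a real job over a large fact table could not do; it is meant only to show that both group-bys read the same input rows once.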