Date: Mon, 26 Nov 2012 17:50:59 +0000 (UTC)
From: "Pradeep Kamath (JIRA)"
To: hive-dev@hadoop.apache.org
Reply-To: dev@hive.apache.org
Message-ID: <855059859.23595.1353952259265.JavaMail.jiratomcat@arcas>
In-Reply-To: <571010118.15334.1353547138381.JavaMail.jiratomcat@arcas>
Subject: [jira] [Updated] (HIVE-3733) Improve Hive's logic for conditional merge

     [ https://issues.apache.org/jira/browse/HIVE-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated HIVE-3733:
---------------------------------

    Attachment: HIVE-3733.1.patch.txt

Attaching patch to fix the issue (used git diff --no-prefix ...).
I tried using "arc diff --jira HIVE-3733" but got:

PHP Fatal error: Call to undefined method ArcanistGitAPI::amendGitHeadCommit() in /Users/pradeepk/opensource-hive/.arc_jira_lib/arcanist/ArcJIRAConfiguration.php on line 173

I saw other references to this error in different JIRAs but no suggested solution. Is there a fix for this issue? In the meantime I manually uploaded a diff (used git diff ..) to create the review: https://reviews.facebook.net/D6969

> Improve Hive's logic for conditional merge
> ------------------------------------------
>
>                 Key: HIVE-3733
>                 URL: https://issues.apache.org/jira/browse/HIVE-3733
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Pradeep Kamath
>            Assignee: Pradeep Kamath
>         Attachments: HIVE-3733.1.patch.txt
>
>
> If the config hive.merge.mapfiles is set to true and hive.merge.mapredfiles is set to false, then when Hive encounters a FileSinkOperator while generating map reduce tasks, it looks at the entire job to see whether it has a reducer; if it does, it will not merge. Instead, it should check whether the FileSinkOperator is a child of the reducer. This means that outputs generated in the mapper will be merged and outputs generated in the reducer will not be, which is the intended effect of setting those configs.
> Simple repro:
> set hive.merge.mapfiles=true;
> set hive.merge.mapredfiles=false;
> EXPLAIN
> FROM
> INSERT OVERWRITE TABLE SELECT key, COUNT(*) group by key
> INSERT OVERWRITE TABLE SELECT *;
> The output should contain a Conditional Operator, Mapred Stages, and Move tasks
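
To make the difference between the two rules in the issue description concrete, here is a minimal Java sketch. It is not taken from the patch or from Hive's code: the class and method names (MergeDecision, mergePerJob, mergePerSink) are hypothetical, and the two configs are modeled as plain booleans rather than read from a Hive configuration object.

```java
// Hypothetical illustration only; these names do not exist in Hive.
final class MergeDecision {
    private MergeDecision() {}

    // Rule described as the existing behavior: the decision is made for the
    // whole job. If the job has a reducer, every file sink follows
    // hive.merge.mapredfiles, including sinks that run on the map side.
    static boolean mergePerJob(boolean mergeMapFiles,
                               boolean mergeMapRedFiles,
                               boolean jobHasReducer) {
        return jobHasReducer ? mergeMapRedFiles : mergeMapFiles;
    }

    // Rule proposed in the issue: the decision is made per file sink.
    // A sink that is a child of the reducer follows hive.merge.mapredfiles;
    // a sink fed directly by the mapper follows hive.merge.mapfiles.
    static boolean mergePerSink(boolean mergeMapFiles,
                                boolean mergeMapRedFiles,
                                boolean sinkIsChildOfReducer) {
        return sinkIsChildOfReducer ? mergeMapRedFiles : mergeMapFiles;
    }
}
```

With hive.merge.mapfiles=true and hive.merge.mapredfiles=false, as in the repro, the per-job rule returns false for both inserts because the job has a reducer (the group by). The per-sink rule returns true for the map-side "SELECT *" insert and false for the reduce-side "group by" insert.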