Return-Path:
X-Original-To: apmail-hive-dev-archive@www.apache.org
Delivered-To: apmail-hive-dev-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 224C1D537 for ; Mon, 27 Aug 2012 16:45:12 +0000 (UTC)
Received: (qmail 12623 invoked by uid 500); 27 Aug 2012 16:45:11 -0000
Delivered-To: apmail-hive-dev-archive@hive.apache.org
Received: (qmail 12548 invoked by uid 500); 27 Aug 2012 16:45:11 -0000
Mailing-List: contact dev-help@hive.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@hive.apache.org
Delivered-To: mailing list dev@hive.apache.org
Received: (qmail 12537 invoked by uid 500); 27 Aug 2012 16:45:11 -0000
Delivered-To: apmail-hadoop-hive-dev@hadoop.apache.org
Received: (qmail 12534 invoked by uid 99); 27 Aug 2012 16:45:11 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 27 Aug 2012 16:45:11 +0000
Date: Tue, 28 Aug 2012 03:45:11 +1100 (NCT)
From: "Namit Jain (JIRA)"
To: hive-dev@hadoop.apache.org
Message-ID: <1939626739.1925.1346085911138.JavaMail.jiratomcat@arcas>
In-Reply-To: <1590697659.40159.1340302243241.JavaMail.jiratomcat@issues-vm>
Subject: [jira] [Commented] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13442517#comment-13442517 ]

Namit Jain commented on HIVE-3171:
----------------------------------

I know the policy, but in most of the cases, I think we don't commit our own patches.
Anyway, it is not a big deal - I don't feel very comfortable about it, but have no reservations if you want to take that path.

> Bucketed sort merge join doesn't work when multiple files exist for small alias
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-3171
>                 URL: https://issues.apache.org/jira/browse/HIVE-3171
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>            Reporter: Joey Echeverria
>            Assignee: Navis
>              Labels: bucketing, joins, partitioning
>         Attachments: HIVE-3171.1.patch.txt
>
>
> Executing a query with the MAPJOIN hint and the bucketed sort merge join optimizations enabled:
> {noformat}
> set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
> set hive.optimize.bucketmapjoin = true;
> set hive.optimize.bucketmapjoin.sortedmerge = true;
> {noformat}
> works fine with partitioned tables if there is only one partition in the table. However, if you add a second partition, Hive attempts to do a regular map-side join which can fail because the tables are too large. Hive ought to be able to still do the bucketed sort merge join with partitions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira