Date: Mon, 7 Mar 2011 18:57:00 +0000 (UTC)
From: "Carl Steinbach (JIRA)"
To: hive-dev@hadoop.apache.org
Message-ID: <381372548.1861.1299524220026.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] Resolved: (HIVE-1723) The result of left semi join is not correct

     [ https://issues.apache.org/jira/browse/HIVE-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach resolved HIVE-1723.
----------------------------------

    Resolution: Duplicate

> The result of left semi join is not correct
> -------------------------------------------
>
>                 Key: HIVE-1723
>                 URL: https://issues.apache.org/jira/browse/HIVE-1723
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Liyin Tang
>            Assignee: Liyin Tang
>
> In the test case semijoin.q, there is a query:
> select /*+ mapjoin(b) */ a.key from t3 a left semi join t1 b on a.key = b.key sort by a.key;
> I think this query will return a wrong result if table t1 has more than 25000 distinct keys.
> To keep it simple, I tried a very similar query:
> select /*+ mapjoin(b) */ a.key from test_semijoin a left semi join test_semijoin b on a.key = b.key sort by a.key;
> The table test_semijoin looks like this:
> 0 0
> 1 1
> 2 2
> 3 3
> 4 4
> 5 5
> ... ...
> ... ....
> 25000 25000
> 25001 25001
> ... ....
> ... ....
> 25999 25999
> 26000 26000
> Since the table is joined with itself on key, the correct result of this query should be exactly the keys of test_semijoin itself.
> Actually, the result is only part of that: only the keys from 0 to 24544.
> 0
> 1
> 2
> ..
> ..
> 24543
> 24544

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
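
For reference, the expected result described in the report can be checked with a small model of left semi join semantics. This is only a sketch in Python, not Hive's implementation: the helper name left_semi_join is hypothetical, and the table is assumed to hold the contiguous keys 0 through 26000, which the excerpt in the report suggests but does not state outright.

```python
def left_semi_join(left_keys, right_keys):
    """Model of `a LEFT SEMI JOIN b ON a.key = b.key` on a single key column.

    Each left row is emitted at most once, and only if some right row has
    an equal, non-NULL key. Results are sorted to mimic `sort by a.key`
    (the model has a single "reducer", so the sort is total).
    """
    # NULL keys never match in an equi-join, so drop them from the lookup set.
    right = {k for k in right_keys if k is not None}
    return sorted(k for k in left_keys if k in right)


# Assumption: test_semijoin holds the contiguous keys 0..26000.
keys = list(range(26001))
result = left_semi_join(keys, keys)

# A self semi join on the key must return every key of the table.
print(len(result), result[0], result[-1])  # prints: 26001 0 26000

# The report observed only keys 0..24544, i.e. this many rows were missing:
print(len(result) - len(range(24545)))  # prints: 1456
```

Under these assumptions the self-join should return all 26001 keys, so the output ending at 24544 reported in the issue is short by 1456 rows.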