Return-Path: X-Original-To: apmail-hive-issues-archive@minotaur.apache.org Delivered-To: apmail-hive-issues-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id D85F318A9A for ; Fri, 21 Aug 2015 11:49:45 +0000 (UTC) Received: (qmail 95915 invoked by uid 500); 21 Aug 2015 11:49:45 -0000 Delivered-To: apmail-hive-issues-archive@hive.apache.org Received: (qmail 95889 invoked by uid 500); 21 Aug 2015 11:49:45 -0000 Mailing-List: contact issues-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@hive.apache.org Delivered-To: mailing list issues@hive.apache.org Received: (qmail 95879 invoked by uid 99); 21 Aug 2015 11:49:45 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 21 Aug 2015 11:49:45 +0000 Date: Fri, 21 Aug 2015 11:49:45 +0000 (UTC) From: "Yongzhi Chen (JIRA)" To: issues@hive.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (HIVE-11604) HIVE return wrong results in some queries with PTF function MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/HIVE-11604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14706577#comment-14706577 ] Yongzhi Chen commented on HIVE-11604: ------------------------------------- [~jcamachorodriguez], the problem is reproducible in master (My master is around 1 week old though). The fix follows the similar pattern in other cases in ProjectRemover, for example: {noformat} Operator parent = parents.get(0); if (parent instanceof ReduceSinkOperator && Iterators.any(sel.getChildOperators().iterator(), Predicates.instanceOf(ReduceSinkOperator.class))) { // For RS-SEL-RS case. reducer operator in reducer task cannot be null in task compiler return null; } {noformat} For the PTF case, it need a select operator follows it. We can add other cases before return null if they fall into the same category. > HIVE return wrong results in some queries with PTF function > ----------------------------------------------------------- > > Key: HIVE-11604 > URL: https://issues.apache.org/jira/browse/HIVE-11604 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer > Affects Versions: 1.2.0, 1.1.0 > Reporter: Yongzhi Chen > Assignee: Yongzhi Chen > Attachments: HIVE-11604.1.patch > > > Following query returns empty result which is not right: > {noformat} > select ddd.id, ddd.fkey, aaa.name > from ( > select id, fkey, > row_number() over (partition by id, fkey) as rnum > from tlb1 group by id, fkey > ) ddd > inner join tlb2 aaa on aaa.fid = ddd.fkey; > {noformat} > After remove row_number() over (partition by id, fkey) as rnum from query, the right result returns. > Reproduce: > {noformat} > create table tlb1 (id int, fkey int, val string); > create table tlb2 (fid int, name string); > insert into table tlb1 values(100,1,'abc'); > insert into table tlb1 values(200,1,'efg'); > insert into table tlb2 values(1, 'key1'); > select ddd.id, ddd.fkey, aaa.name > from ( > select id, fkey, > row_number() over (partition by id, fkey) as rnum > from tlb1 group by id, fkey > ) ddd > inner join tlb2 aaa on aaa.fid = ddd.fkey; > .... > INFO : Ended Job = job_local1070163923_0017 > +---------+-----------+-----------+--+ > No rows selected (14.248 seconds) > | ddd.id | ddd.fkey | aaa.name | > +---------+-----------+-----------+--+ > +---------+-----------+-----------+--+ > 0: jdbc:hive2://localhost:10000> select ddd.id, ddd.fkey, aaa.name > from ( > select id, fkey > from tlb1 group by id, fkey > ) ddd > inner join tlb2 aaa on aaa.fid = ddd.fkey;select ddd.id, ddd.fkey, aaa.name > 0: jdbc:hive2://localhost:10000> from ( > 0: jdbc:hive2://localhost:10000> select id, fkey > 0: jdbc:hive2://localhost:10000> from tlb1 group by id, fkey > 0: jdbc:hive2://localhost:10000> ) ddd > 0: jdbc:hive2://localhost:10000> > inner join tlb2 aaa on aaa.fid = ddd.fkey; > INFO : Number of reduce tasks not specified. Estimated from input data size: 1 > ... > INFO : Ended Job = job_local672340505_0019 > +---------+-----------+-----------+--+ > 2 rows selected (14.383 seconds) > | ddd.id | ddd.fkey | aaa.name | > +---------+-----------+-----------+--+ > | 100 | 1 | key1 | > | 200 | 1 | key1 | > +---------+-----------+-----------+--+ > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)