hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Yongzhi Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-11604) HIVE return wrong results in some queries with PTF function
Date Fri, 21 Aug 2015 14:59:45 GMT

    [ https://issues.apache.org/jira/browse/HIVE-11604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14706832#comment-14706832
] 

Yongzhi Chen commented on HIVE-11604:
-------------------------------------

My fix will change query plan a little bit in some cases(especially wrong results return without
the fixes cases). 
The failure is because of an extra select operator in the query plan, it should be no harm:
it does correct RS_5's output column
to the right sequence, although it might not be so  important in this scenario. 

Attach second patch to change the test output, and add more test cases which succeed in master
even without my fix to catch possible regressions in the future. 

In current master:
This query returns wrong results:
{noformat}
select ddd.id, ddd.fkey, aaa.name
from (
    select id, fkey, 
    row_number() over (partition by id, fkey) as rnum
    from tlb1 group by id, fkey
 ) ddd 
inner join tlb2 aaa on aaa.fid = ddd.fkey;
{noformat}

while following returns right value(only different is the extra ddd.rnum):
{noformat}
select ddd.id, ddd.fkey, aaa.name, ddd.rnum
from (
    select id, fkey, 
    row_number() over (partition by id, fkey) as rnum
    from tlb1 group by id, fkey
 ) ddd 
inner join tlb2 aaa on aaa.fid = ddd.fkey;
{noformat}


> HIVE return wrong results in some queries with PTF function
> -----------------------------------------------------------
>
>                 Key: HIVE-11604
>                 URL: https://issues.apache.org/jira/browse/HIVE-11604
>             Project: Hive
>          Issue Type: Bug
>          Components: Logical Optimizer
>    Affects Versions: 1.2.0, 1.1.0, 2.0.0
>            Reporter: Yongzhi Chen
>            Assignee: Yongzhi Chen
>         Attachments: HIVE-11604.1.patch, HIVE-11604.2.patch
>
>
> Following query returns empty result which is not right:
> {noformat}
> select ddd.id, ddd.fkey, aaa.name
> from (
>     select id, fkey, 
>     row_number() over (partition by id, fkey) as rnum
>     from tlb1 group by id, fkey
>  ) ddd 
> inner join tlb2 aaa on aaa.fid = ddd.fkey;
> {noformat}
> After remove row_number() over (partition by id, fkey) as rnum from query, the right
result returns.
> Reproduce:
> {noformat}
> create table tlb1 (id int, fkey int, val string);
> create table tlb2 (fid int, name string);
> insert into table tlb1 values(100,1,'abc');
> insert into table tlb1 values(200,1,'efg');
> insert into table tlb2 values(1, 'key1');
> select ddd.id, ddd.fkey, aaa.name
> from (
>     select id, fkey, 
>     row_number() over (partition by id, fkey) as rnum
>     from tlb1 group by id, fkey
>  ) ddd 
> inner join tlb2 aaa on aaa.fid = ddd.fkey;
> ....
> INFO  : Ended Job = job_local1070163923_0017
> +---------+-----------+-----------+--+
> No rows selected (14.248 seconds)
> | ddd.id  | ddd.fkey  | aaa.name  |
> +---------+-----------+-----------+--+
> +---------+-----------+-----------+--+
> 0: jdbc:hive2://localhost:10000> select ddd.id, ddd.fkey, aaa.name
> from (
>     select id, fkey 
>     from tlb1 group by id, fkey
>  ) ddd 
> inner join tlb2 aaa on aaa.fid = ddd.fkey;select ddd.id, ddd.fkey, aaa.name
> 0: jdbc:hive2://localhost:10000> from (
> 0: jdbc:hive2://localhost:10000>     select id, fkey 
> 0: jdbc:hive2://localhost:10000>     from tlb1 group by id, fkey
> 0: jdbc:hive2://localhost:10000>  ) ddd 
> 0: jdbc:hive2://localhost:10000> 
> inner join tlb2 aaa on aaa.fid = ddd.fkey;
> INFO  : Number of reduce tasks not specified. Estimated from input data size: 1
> ...
> INFO  : Ended Job = job_local672340505_0019
> +---------+-----------+-----------+--+
> 2 rows selected (14.383 seconds)
> | ddd.id  | ddd.fkey  | aaa.name  |
> +---------+-----------+-----------+--+
> | 100     | 1         | key1      |
> | 200     | 1         | key1      |
> +---------+-----------+-----------+--+
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message