hive-dev mailing list archives

From "Ashutosh Chauhan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-9347) Bug with max() together with rank() and grouping sets
Date Mon, 19 Jan 2015 08:03:34 GMT

    [ https://issues.apache.org/jira/browse/HIVE-9347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282238#comment-14282238 ]

Ashutosh Chauhan commented on HIVE-9347:
----------------------------------------

+1

> Bug with max() together with rank() and grouping sets
> -----------------------------------------------------
>
>                 Key: HIVE-9347
>                 URL: https://issues.apache.org/jira/browse/HIVE-9347
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.14.0, 0.13.1
>         Environment: Amazon Elastic Map Reduce, AMI 3.3.1, Hadoop Amazon 2.4.0, Hive 0.13.1
>            Reporter: Michal Krawczyk
>            Assignee: Navis
>         Attachments: HIVE-9347.1.patch.txt, HIVE-9347.2.patch.txt, HIVE-9347.3.patch.txt
>
>
> It looks like the query below returns incorrect results on Hive 0.13.1, but it was working fine on Hive 0.11.
> I have the following table:
> CREATE  TABLE `t`(
>   `category` int, 
>   `live` int, 
>   `comments` int)
> with the following data:
> hive> select * from t;
> OK
> 3       0       2
> 2       0       2
> 8       0       2
> The query:
> hive> select category, max(live) live, max(comments) comments, rank() OVER (PARTITION BY category ORDER BY comments) rank1
> FROM t
> GROUP BY category
> GROUPING SETS ((), (category))
> HAVING max(comments) > 0;
> returns the following results:
> NULL    1       48      1
> 2       1       49      1
> 3       1       49      1
> 8       1       49      1
> When grouping sets are used together with the rank() function, the max() function returns incorrect results. Everything works fine if I remove the grouping sets clause and split the query into two independent queries, or if I remove the rank() function.
> This looks like a bug to me, but please review. That said, I'm not sure whether it is an Amazon-specific issue or a general Hive issue.
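
For reference, here is a minimal sketch in Python (not HiveQL) of the values the aggregates would be expected to produce for the sample data in the report. It emulates GROUPING SETS ((), (category)) by hand; the function name expected_grouping_sets is hypothetical, and the rank of 1 per row assumes the window function is evaluated over the already grouped rows, so each category partition holds a single row.

```python
# Sample rows from the report: (category, live, comments)
rows = [(3, 0, 2), (2, 0, 2), (8, 0, 2)]


def expected_grouping_sets(rows):
    """Emulate GROUP BY category GROUPING SETS ((), (category))
    with HAVING max(comments) > 0 for rows of (category, live, comments).

    Simplification: assumes no input row has a NULL (None) category,
    so None can stand for the grand-total group of the () grouping set.
    """
    groups = {None: list(rows)}  # the () grouping set: one grand-total group
    for row in rows:
        groups.setdefault(row[0], []).append(row)

    result = []
    for category, members in groups.items():
        max_live = max(m[1] for m in members)
        max_comments = max(m[2] for m in members)
        if max_comments > 0:  # HAVING max(comments) > 0
            # Assumption: one grouped row per category partition, so rank() is 1.
            result.append((category, max_live, max_comments, 1))

    # Put the grand-total row first, then order by category.
    return sorted(result, key=lambda r: (r[0] is not None, r[0] or 0))


for row in expected_grouping_sets(rows):
    print(row)
```

With the three sample rows, every group has max(live) = 0 and max(comments) = 2, in contrast to the 1 and 48/49 shown in the reported output.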



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
