hive-dev mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-5657) TopN produces incorrect results with count(distinct)
Date Thu, 31 Oct 2013 00:13:26 GMT

    [ https://issues.apache.org/jira/browse/HIVE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809785#comment-13809785 ]

Sergey Shelukhin commented on HIVE-5657:
----------------------------------------

Left some comments on fb. Can you summarize how the change works? It seems that with any distincts,
it will just forward all rows (as if it were disabled), but there is a lot of code to do that.

> TopN produces incorrect results with count(distinct)
> ----------------------------------------------------
>
>                 Key: HIVE-5657
>                 URL: https://issues.apache.org/jira/browse/HIVE-5657
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Navis
>            Priority: Critical
>         Attachments: D13797.1.patch, example.patch, HIVE-5657.1.patch.txt
>
>
> Attached patch illustrates the problem.
> limit_pushdown test has various other cases of aggregations and distincts, incl. count-distinct,
> that work correctly (that said, src dataset is bad for testing these things because every
> count, for example, produces one record only), so something must be special about this case.
> I am not very familiar with the distinct code and these nuances; if someone knows a quick
> fix, feel free to take this; otherwise I will probably start looking next week.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
