hive-dev mailing list archives

From "Biao Wu (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-15848) count or sum distinct incorrect when hive.optimize.reducededuplication set to true
Date Wed, 08 Feb 2017 08:19:41 GMT
Biao Wu created HIVE-15848:
------------------------------

             Summary: count or sum distinct incorrect when hive.optimize.reducededuplication set to true
                 Key: HIVE-15848
                 URL: https://issues.apache.org/jira/browse/HIVE-15848
             Project: Hive
          Issue Type: Bug
    Affects Versions: 0.13.0
            Reporter: Biao Wu
            Priority: Critical


Test Table:
{code:sql}
create table count_distinct_test(id int,key int,name int);
{code}
Data:
||id||key||name||
|1|1|2|
|1|2|3|
|1|3|2|
|1|4|2|
|1|5|3|

Test SQL1:
{code:sql}
select id,count(Distinct key),count(Distinct name)
from (select id,key,name from count_distinct_test group by id,key,name)m
group by id;
{code}

Result:
|1|5|4|

Expected:
|1|5|2|

Test SQL2:
{code:sql}
select id,count(Distinct name),count(Distinct key)
from (select id,key,name from count_distinct_test group by id,name,key)m
group by id;
{code}

Result:
|1|2|5|
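
For comparison, Test SQL1 can be rerun with the optimization switched off for the session. This is a minimal sketch, assuming the wrong count is tied to hive.optimize.reducededuplication as the summary states. It reuses the query and the table name count_distinct_test from above and has not been checked against other Hive versions.

{code:sql}
-- Disable reduce deduplication for the current session only.
set hive.optimize.reducededuplication=false;

-- Test SQL1, unchanged. The data above has 5 distinct keys and
-- 2 distinct names (2 and 3) for id 1, so the expected row is 1, 5, 2.
select id,count(Distinct key),count(Distinct name)
from (select id,key,name from count_distinct_test group by id,key,name)m
group by id;

-- Re-enable the setting to reproduce the reported result.
set hive.optimize.reducededuplication=true;
{code}

If the two runs disagree on count(Distinct name), that points to the optimization rather than the data.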





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
