hive-issues mailing list archives

From "Matt McCline (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-12435) SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled.
Date Thu, 10 Dec 2015 03:03:11 GMT

    [ https://issues.apache.org/jira/browse/HIVE-12435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15049926#comment-15049926 ]

Matt McCline commented on HIVE-12435:
-------------------------------------

Ok, VectorUDAFCount.aggregateInputSelection is looking at scratch column 3 as input.
isRepeating = false, noNulls = true, vector = {1, 0, 0, 0, 0}

But there should be NULLs!
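
To make the observation above concrete, here is a minimal, self-contained sketch of why a wrong noNulls flag yields a count of 1 for every key. It is not Hive's code: LongColumn, evalCase, and countPerRow are hypothetical stand-ins, and the sketch only assumes the usual vectorized convention that the isNull array is consulted only when noNulls is false.

```java
import java.util.Arrays;

class CountNullSketch {
    /** Hypothetical, simplified stand-in for a vectorized long column (not Hive's class). */
    static final class LongColumn {
        final long[] vector;
        final boolean[] isNull;
        boolean noNulls = true;
        boolean isRepeating = false;

        LongColumn(int size) {
            vector = new long[size];
            isNull = new boolean[size];
        }
    }

    /** Evaluates CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END for each row. */
    static LongColumn evalCase(Boolean[] bool) {
        LongColumn out = new LongColumn(bool.length);
        for (int i = 0; i < bool.length; i++) {
            if (bool[i] == null) {
                out.isNull[i] = true;   // mark the row as NULL
                out.noNulls = false;    // and record that the column has NULLs
            } else {
                out.vector[i] = bool[i] ? 1 : 0;
            }
        }
        return out;
    }

    /** COUNT(expr) with every row in its own group: 1 for non-NULL input, 0 for NULL. */
    static long[] countPerRow(LongColumn col) {
        long[] counts = new long[col.vector.length];
        for (int i = 0; i < counts.length; i++) {
            // Assumed convention: isNull is only read when noNulls is false.
            boolean rowIsNull = !col.noNulls && col.isNull[col.isRepeating ? 0 : i];
            counts[i] = rowIsNull ? 0 : 1;
        }
        return counts;
    }

    public static void main(String[] args) {
        Boolean[] bool = {true, false, null, false, null};

        LongColumn expected = evalCase(bool);
        System.out.println(Arrays.toString(countPerRow(expected)));  // [1, 1, 0, 1, 0]

        // Reproduce the state seen in the debugger: same values, but noNulls = true.
        LongColumn observed = evalCase(bool);
        observed.noNulls = true;
        System.out.println(Arrays.toString(observed.vector));        // [1, 0, 0, 0, 0]
        System.out.println(Arrays.toString(countPerRow(observed)));  // [1, 1, 1, 1, 1]
    }
}
```

Under these assumptions, the value vector {1, 0, 0, 0, 0} is the same in both cases; only the noNulls flag differs, which matches the wrong counts for key3 and key5 in the report.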

> SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-12435
>                 URL: https://issues.apache.org/jira/browse/HIVE-12435
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>    Affects Versions: 2.0.0
>            Reporter: Takahiko Saito
>            Assignee: Matt McCline
>            Priority: Critical
>         Attachments: vector_select_null2.q
>
>
> Run the following query:
> {noformat}
> create table count_case_groupby (key string, bool boolean) STORED AS orc;
> insert into table count_case_groupby values ('key1', true),('key2', false),('key3', NULL),('key4', false),('key5',NULL);
> {noformat}
> The table contains the following:
> {noformat}
> key1	true
> key2	false
> key3	NULL
> key4	false
> key5	NULL
> {noformat}
> The following query returns:
> {noformat}
> SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) AS cnt_bool0_ok FROM count_case_groupby GROUP BY key;
> key1	1
> key2	1
> key3	1
> key4	1
> key5	1
> {noformat}
> whereas the expected result is:
> {noformat}
> key1	1
> key2	1
> key3	0
> key4	1
> key5	0
> {noformat}
> The query works on Hive 1.2. It also works when the table is not stored as ORC,
> and on an ORC table when vectorization is disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
