hive-issues mailing list archives

From "Chetna Chaudhari (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-11735) Different results when multiple if() functions are used
Date Sun, 06 Sep 2015 10:45:45 GMT

    [ https://issues.apache.org/jira/browse/HIVE-11735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14732343#comment-14732343 ]

Chetna Chaudhari commented on HIVE-11735:
-----------------------------------------

The issue occurs while generating the operator tree (OPTree).
Causes:
1) The RowResolver class converts table and column aliases to lowercase.
2) In genSelectPlan, the aggregations are also converted to lowercase before being added to the aggregations map.

Removing the toLowerCase() calls from the places mentioned above resolved the issue for group-by aggregations.
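
To illustrate cause 2, here is a minimal, self-contained Java sketch. It is not Hive's code: the class AggregationKeyDemo, the method collect, and the map layout are hypothetical stand-ins, and it assumes only that the text of each aggregation expression is used as the map key. Under that assumption, lowercasing the key makes the two expressions from the report collide, and the rightmost one overwrites the first.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; names and structure are not taken from Hive.
final class AggregationKeyDemo {

    // Builds a map from expression key to the expression that owns that key.
    // When lowerCaseKeys is true, string literals inside the expression are
    // lowercased along with everything else, so distinct expressions can collide.
    static Map<String, String> collect(String[] exprs, boolean lowerCaseKeys) {
        Map<String, String> aggregations = new LinkedHashMap<>();
        for (String expr : exprs) {
            String key = lowerCaseKeys ? expr.toLowerCase(Locale.ROOT) : expr;
            aggregations.put(key, expr); // a later put replaces an earlier value
        }
        return aggregations;
    }

    public static void main(String[] args) {
        String[] exprs = {
            "min(if(name = 'chetna', 4, 3))",
            "min(if(name = 'Chetna', 4, 3))"
        };
        // With lowercased keys both expressions share one entry.
        System.out.println(collect(exprs, true).size());  // prints 1
        // With case-preserving keys they stay distinct.
        System.out.println(collect(exprs, false).size()); // prints 2
    }
}
```

In the colliding case the surviving value is the second (rightmost) expression, which is consistent with the observation in the issue description that the rightmost result shows up in both columns.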

Note that this issue can occur with all aggregations, joins, and selects, because the same logic is used for other operators too.
I would like to contribute the patch if everyone agrees on the change.

> Different results when multiple if() functions are used 
> --------------------------------------------------------
>
>                 Key: HIVE-11735
>                 URL: https://issues.apache.org/jira/browse/HIVE-11735
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Chetna Chaudhari
>            Assignee: Chetna Chaudhari
>
> The Hive if() UDF returns different results when string equality is used as the condition and only the case of the compared string changes.
> Observations:
>    1) if(name = 'chetna', 3, 4) and if(name = 'Chetna', 3, 4) are treated as equal.
>    2) The result of the rightmost UDF is pushed to the predicates on the left, leading to the same result for both UDFs.
> How to reproduce the issue:
> 1) CREATE TABLE `sample`(
>   `name` string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> TBLPROPERTIES (
>   'transient_lastDdlTime'='1425075745');
> 2) insert into table sample values ('chetna');
> 3) select min(if(name = 'chetna', 4, 3)) , min(if(name='Chetna', 4, 3))  from sample;
>     This gives the result:
>     3    3
>     Expected result:
>     4    3
> 4) select min(if(name = 'Chetna', 4, 3)) , min(if(name='chetna', 4, 3))  from sample;
>     This gives the result:
>     4    4
>     Expected result:
>     3    4



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
