hive-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-11477) CBO inserts a UDF cast for integer type promotion
Date Thu, 06 Aug 2015 00:08:05 GMT

    [ https://issues.apache.org/jira/browse/HIVE-11477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659239#comment-14659239 ]

Sergey Shelukhin commented on HIVE-11477:
-----------------------------------------

I seem to recall something about these UDFs. They might be necessary because of type issues
in Calcite, if I am remembering the same issue. In the general case it is very hard to strip
them back off, because it is hard to tell existing user casts apart from the casts that appear
after Calcite has changed the plan. IIRC, the idea I had was that Calcite would need some form
of separate cast that does the same thing but is distinguishable from a regular cast, or that
the functions in the RelNode tree would need to be taggable, with tags preserved during
transformations.
Although maybe it's a different, simpler issue; not sure.
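
The tagging idea above can be sketched in a few lines. This is only an illustration under
stated assumptions, not Calcite's or Hive's API: the Column, Literal, Cast and Compare classes
and the `synthetic` flag are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Column:
    name: str
    type: str


@dataclass(frozen=True)
class Literal:
    value: int


@dataclass(frozen=True)
class Cast:
    operand: "Expr"
    target: str
    # Hypothetical tag: True when the planner added the cast, False for a user cast.
    synthetic: bool = False


@dataclass(frozen=True)
class Compare:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Column, Literal, Cast, Compare]


def strip_synthetic_casts(expr):
    """Remove planner-added casts and keep casts written by the user."""
    if isinstance(expr, Cast):
        inner = strip_synthetic_casts(expr.operand)
        return inner if expr.synthetic else Cast(inner, expr.target, False)
    if isinstance(expr, Compare):
        return Compare(
            expr.op,
            strip_synthetic_casts(expr.left),
            strip_synthetic_casts(expr.right),
        )
    return expr
```

With such a tag, a filter built as Compare("<", Cast(t, "int", synthetic=True), Literal(-127))
would come back as a plain comparison on t, while the same filter with a user-written cast
would be left alone. The sketch does not check whether removing the cast preserves the
meaning of the comparison; a real implementation would have to.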

> CBO inserts a UDF cast for integer type promotion
> -------------------------------------------------
>
>                 Key: HIVE-11477
>                 URL: https://issues.apache.org/jira/browse/HIVE-11477
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Pengcheng Xiong
>
> When CBO is enabled, filters that compare tinyint or smallint columns with integer constants
> get a UDFToInteger cast inserted on the column. When CBO is disabled, there is no such UDF.
> This behaviour breaks the ORC predicate pushdown feature, as ORC ignores UDFs in the filters.
> In the following examples, column t is a tinyint.
> {code:title=Explain for select count(*) from orc_ppd where t < -127; (CBO OFF)}
> Filter Operator [FIL_9]
>                            predicate:(t = 125) (type: boolean)
>                            Statistics:Num rows: 1050 Data size: 611757 Basic stats: COMPLETE Column stats: NONE
>                            TableScan [TS_0]
>                               alias:orc_ppd
>                               Statistics:Num rows: 2100 Data size: 1223514 Basic stats: COMPLETE Column stats: NONE
> {code}
> {code:title=Explain for select count(*) from orc_ppd where t < -127; (CBO ON)}
> Filter Operator [FIL_10]
>                            predicate:(UDFToInteger(t) < -127) (type: boolean)
>                            Statistics:Num rows: 700 Data size: 407838 Basic stats: COMPLETE Column stats: NONE
>                            TableScan [TS_0]
>                               alias:orc_ppd
>                               Statistics:Num rows: 2100 Data size: 1223514 Basic stats: COMPLETE Column stats: NONE
> {code}
> CBO does not insert such a cast for non-negative numbers:
> {code:title=Explain for select count(*) from orc_ppd where t < 127; (CBO ON)}
> Filter Operator [FIL_10]
>                            predicate:(t < 127) (type: boolean)
>                            Statistics:Num rows: 700 Data size: 407838 Basic stats: COMPLETE Column stats: NONE
>                            TableScan [TS_0]
>                               alias:orc_ppd
>                               Statistics:Num rows: 2100 Data size: 1223514 Basic stats: COMPLETE Column stats: NONE
> {code}
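
To illustrate why the inserted cast matters for pushdown, here is a toy converter that accepts
only "column op literal" predicates and gives up on anything wrapped in a function call. It is
a sketch of the general pattern, not ORC's actual search-argument code; the tuple encoding and
the name to_pushdown_leaf are assumptions made for the example.

```python
def to_pushdown_leaf(predicate):
    """Return (op, column_name, literal) if the predicate can be pushed down, else None.

    Hypothetical encoding: a predicate is (op, left, right), a column is
    ("col", name), a function call is ("call", function_name, argument),
    and a literal is a plain int.
    """
    op, left, right = predicate
    is_bare_column = isinstance(left, tuple) and left[0] == "col"
    if is_bare_column and isinstance(right, int):
        return (op, left[1], right)
    # Anything else, including a UDF wrapped around the column, is not pushed down.
    return None
```

Under this encoding, ("<", ("col", "t"), 127) is converted to a leaf, while
("<", ("call", "UDFToInteger", ("col", "t")), -127) is rejected, which mirrors the difference
between the two CBO ON plans quoted above.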



