hive-dev mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-8111) CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO
Date Wed, 24 Sep 2014 21:51:34 GMT

     [ https://issues.apache.org/jira/browse/HIVE-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-8111:
-----------------------------------
    Attachment: HIVE-8111.03.patch

The test failed due to plan changes. In the meantime, remove the CBO flag from the test.

> CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO
> ----------------------------------------------------------------------------
>
>                 Key: HIVE-8111
>                 URL: https://issues.apache.org/jira/browse/HIVE-8111
>             Project: Hive
>          Issue Type: Sub-task
>          Components: CBO
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-8111.01.patch, HIVE-8111.02.patch, HIVE-8111.03.patch, HIVE-8111.patch
>
>
> Original test failure: it looks like the column type changes to a different decimal type in
> most cases. In one case this makes the integer part too big to fit, so the result seems to
> become null.
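The null result described above can be modeled with a small sketch. This is not Hive code: `cast_to_decimal` is a hypothetical helper, and it assumes that a value whose integer part needs more than `precision - scale` digits becomes null (`None` here), which is how the failure reads in the description.

```python
from decimal import Decimal, ROUND_HALF_UP


def cast_to_decimal(value, precision, scale):
    """Hypothetical model of a decimal(precision, scale) cast.

    Returns the value rounded to `scale` fractional digits, or None
    when the integer part does not fit in `precision - scale` digits.
    """
    # Round to the requested number of fractional digits.
    quantized = value.quantize(Decimal(1).scaleb(-scale), rounding=ROUND_HALF_UP)
    # Count the digits in front of the decimal point (zero counts as none).
    integer_part = abs(int(quantized))
    integer_digits = len(str(integer_part)) if integer_part != 0 else 0
    if integer_digits > precision - scale:
        return None  # assumed overflow behavior: the result becomes null
    return quantized


fits = cast_to_decimal(Decimal("12345.678"), 10, 2)
overflows = cast_to_decimal(Decimal("12345.678"), 6, 4)
```

With these inputs, the first cast leaves room for eight integer digits and succeeds, while the second leaves room for only two and yields `None`.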
> What happens is that CBO adds casts to arithmetic expressions to make them type compatible;
> these casts become part of the new AST, and then Hive adds its own casts on top of them. The
> CBO-added casts also cause lots of out file changes, in addition to incorrect decimal width
> and sometimes nulls when the width is larger than allowed in Hive. It's not clear so far how
> best to fix it.
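The stacking of casts can be illustrated with a toy expression tree. Everything here is an assumption made for illustration: the node classes, the `coerce_operands` pass, and the decimal type names are hypothetical and do not correspond to Hive or CBO classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Col:
    name: str
    type: str


@dataclass(frozen=True)
class Cast:
    child: object
    type: str


@dataclass(frozen=True)
class Add:
    left: object
    right: object


def coerce_operands(expr, target):
    """Wrap each operand of an Add in a cast to `target` unless it already has that type."""
    def wrap(operand):
        return operand if operand.type == target else Cast(operand, target)
    return Add(wrap(expr.left), wrap(expr.right))


def render(e):
    """Print the tree as SQL-like text."""
    if isinstance(e, Col):
        return e.name
    if isinstance(e, Cast):
        return f"CAST({render(e.child)} AS {e.type})"
    return f"({render(e.left)} + {render(e.right)})"


expr = Add(Col("a", "int"), Col("b", "decimal(10,2)"))
# First pass (standing in for CBO) picks one common type.
after_first = coerce_operands(expr, "decimal(12,2)")
# Second pass (standing in for Hive) derives a different type and casts again.
after_second = coerce_operands(after_first, "decimal(13,2)")
print(render(after_second))
```

Because the second pass sees the first pass's casts as ordinary operands, each column ends up wrapped twice, and the final type is whatever the second pass derived.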
> Option one - don't add those casts for numeric ops. This cannot be done if the numeric op is
> part of a comparison, for which CBO needs correct types.
> Option two - unwrap casts when determining the type in Hive. It is hard or impossible to tell
> CBO-added casts apart from user casts.
> Option three - don't change types in Hive if CBO has run. This seems hacky, and it is hard to
> ensure it's applied everywhere.
> Option four - map all expressions precisely between the two trees and remove the casts again
> after optimization. This would be pretty difficult.
> Option five - somehow mark those casts. Not sure how yet.
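One possible reading of option five is sketched below. The message does not say how the marking would be done, so the `synthetic` flag, the node classes, and `strip_synthetic` are all hypothetical; the sketch only shows that a marker would let optimizer-added casts be removed while user casts are kept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Cast:
    child: object
    type: str
    synthetic: bool = False  # assumed marker: True only for optimizer-added casts


@dataclass(frozen=True)
class Add:
    left: object
    right: object


def strip_synthetic(e):
    """Remove casts marked as synthetic; keep casts written by the user."""
    if isinstance(e, Cast):
        child = strip_synthetic(e.child)
        return child if e.synthetic else Cast(child, e.type, False)
    if isinstance(e, Add):
        return Add(strip_synthetic(e.left), strip_synthetic(e.right))
    return e


user_cast = Cast(Col("a"), "double")  # written by the user, not marked
tree = Add(
    Cast(user_cast, "decimal(12,2)", synthetic=True),
    Cast(Col("b"), "decimal(12,2)", synthetic=True),
)
cleaned = strip_synthetic(tree)
```

After stripping, the tree contains only the user's cast on `a` and the bare column `b`, so a later type-resolution pass would start from the expression the user actually wrote.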



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
