hive-issues mailing list archives

From "Vineet Garg (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-20366) TPC-DS query78 stats estimates are off for is null filter
Date Sun, 19 Aug 2018 01:02:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-20366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vineet Garg updated HIVE-20366:
-------------------------------
    Attachment: HIVE-20366.4.patch

> TPC-DS query78 stats estimates are off for is null filter
> ---------------------------------------------------------
>
>                 Key: HIVE-20366
>                 URL: https://issues.apache.org/jira/browse/HIVE-20366
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>            Reporter: Vineet Garg
>            Assignee: Vineet Garg
>            Priority: Major
>         Attachments: HIVE-20366.1.patch, HIVE-20366.2.patch, HIVE-20366.3.patch, HIVE-20366.4.patch
>
>
> In Query 78, there are left outer joins between fact table pairs: store_sales LOJ store_returns, catalog_sales LOJ catalog_returns, and web_sales LOJ web_returns. Each of these joins is estimated to produce only a single row, so the result is BROADCAST, which causes hash table memory errors.
> {code}
>          Reducer 12                                 |
> |             Execution mode: vectorized, llap       |
> |             Reduce Operator Tree:                  |
> +----------------------------------------------------+
> |                      Explain                       |
> +----------------------------------------------------+
> |               Map Join Operator                    |
> |                 condition map:                     |
> |                      Left Outer Join 0 to 1        |
> |                 keys:                              |
> |                   0 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 (type: bigint) |
> |                   1 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 (type: bigint) |
> |                 outputColumnNames: _col0, _col1, _col3, _col4, _col5, _col6, _col8 |
> |                 input vertices:                    |
> |                   1 Map 14                         |
> |                 Statistics: Num rows: 10282477384 Data size: 534184867432 Basic stats: COMPLETE Column stats: COMPLETE |
> |                 Filter Operator                    |
> |                   predicate: _col8 is null (type: boolean) |
> |                  * Statistics: Num rows: 1* Data size: 52 Basic stats: COMPLETE Column stats: COMPLETE |
> {code}
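
The drop from 10282477384 rows to 1 row at the "is null" filter can be illustrated with a small sketch. It is not taken from the issue or from Hive's code. It assumes a naive estimator that reads the null count from base-table column statistics (zero for a non-null key column) and clamps the result to at least one row, and compares it with an estimator that accounts for NULLs introduced by unmatched rows of the left outer join. The function names and the 10% match rate are hypothetical.

```python
def naive_is_null_rows(num_nulls_in_column_stats: int) -> int:
    """Assumed naive model: trust the base-table null count, never go below one row."""
    return max(1, num_nulls_in_column_stats)


def outer_join_aware_is_null_rows(
    num_nulls_in_column_stats: int,
    join_output_rows: int,
    matched_rows: int,
) -> int:
    """Hypothetical alternative: unmatched left rows carry NULL in right-side columns."""
    unmatched = max(0, join_output_rows - matched_rows)
    estimate = num_nulls_in_column_stats + unmatched
    # The filter cannot return more rows than it receives.
    return max(1, min(join_output_rows, estimate))


# Row count reported for the Map Join Operator in the plan above.
join_rows = 10_282_477_384

# Base-table column stats report no NULLs, so the naive estimate collapses to 1.
naive = naive_is_null_rows(0)

# The match rate is a made-up value used only for illustration.
aware = outer_join_aware_is_null_rows(0, join_rows, matched_rows=join_rows // 10)

print(f"naive estimate: {naive}")
print(f"outer-join-aware estimate: {aware}")
```

Under these assumptions the naive estimate is a single row, which is small enough for a planner to choose a broadcast join, while the outer-join-aware estimate stays in the billions.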



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
