hive-issues mailing list archives

From "Remus Rusanu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-16757) Use memoization in HiveRelMdRowCount.getRowCount
Date Thu, 25 May 2017 13:01:04 GMT

     [ https://issues.apache.org/jira/browse/HIVE-16757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Remus Rusanu updated HIVE-16757:
--------------------------------
    Attachment: HIVE-16757.01.patch

Patch 01 for running the tests 

> Use memoization in HiveRelMdRowCount.getRowCount
> ------------------------------------------------
>
>                 Key: HIVE-16757
>                 URL: https://issues.apache.org/jira/browse/HIVE-16757
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>            Reporter: Remus Rusanu
>            Assignee: Remus Rusanu
>         Attachments: HIVE-16757.01.patch
>
>
> On complex queries, HiveRelMdRowCount.getRowCount can be called many times. Because it
> does not memoize its result and the call is recursive, the number of calls explodes.
> For example, for a query with 49 joins, during join ordering (LoptOptimizeJoinRule), HiveRelMdRowCount.getRowCount
> is called 6442 times as a top-level call, but the recursion expands this to 501729 calls. Memoizing
> the result would stop the recursion early. In my testing this reduced the join reordering
> time for that query from 11s to <1s.
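
To illustrate the effect described above, here is a minimal, self-contained Java sketch of memoizing a recursive row-count estimate. It is not the code in HIVE-16757.01.patch and does not use Hive or Calcite classes: Node, Estimator, sharedPlan and the row-count formula (selectivity * left * right) are hypothetical names and assumptions chosen only to make the call explosion visible.

```java
import java.util.IdentityHashMap;
import java.util.Map;

final class RowCountMemoSketch {

    /** Hypothetical plan node (not a Calcite RelNode): a leaf scan or a binary join. */
    static final class Node {
        final Node left;
        final Node right;
        final double value; // base row count for a scan, selectivity for a join

        private Node(Node left, Node right, double value) {
            this.left = left;
            this.right = right;
            this.value = value;
        }

        static Node scan(double rows) { return new Node(null, null, rows); }

        static Node join(Node l, Node r, double selectivity) { return new Node(l, r, selectivity); }

        boolean isScan() { return left == null; }
    }

    /** Recursive estimator that counts how often an estimate is actually computed. */
    static final class Estimator {
        private final Map<Node, Double> cache; // null means memoization is disabled
        long evaluations = 0;

        Estimator(boolean memoize) {
            this.cache = memoize ? new IdentityHashMap<Node, Double>() : null;
        }

        double rowCount(Node n) {
            if (cache != null) {
                Double hit = cache.get(n);
                if (hit != null) {
                    return hit; // cached result: the recursion stops here
                }
            }
            evaluations++;
            // Assumed toy formula: join rows = selectivity * left rows * right rows.
            double rows = n.isScan()
                    ? n.value
                    : n.value * rowCount(n.left) * rowCount(n.right);
            if (cache != null) {
                cache.put(n, rows);
            }
            return rows;
        }
    }

    /** Builds a plan in which every join uses the same sub-plan for both inputs. */
    static Node sharedPlan(int depth) {
        Node n = Node.scan(2.0);
        for (int i = 0; i < depth; i++) {
            n = Node.join(n, n, 0.5);
        }
        return n;
    }

    public static void main(String[] args) {
        Node plan = sharedPlan(10);
        Estimator naive = new Estimator(false);
        Estimator memo = new Estimator(true);
        naive.rowCount(plan);
        memo.rowCount(plan);
        System.out.println("naive evaluations:    " + naive.evaluations);
        System.out.println("memoized evaluations: " + memo.evaluations);
    }
}
```

In this toy plan of depth 10, the unmemoized estimator computes an estimate 2047 times and the memoized one 11 times, once per distinct node. The sketch assumes nodes are immutable and keys the cache on node identity; a real planner would also need to drop cached values whenever the plan is rewritten.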



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
