hive-issues mailing list archives

From "Zoltan Haindrich (JIRA)" <>
Subject [jira] [Updated] (HIVE-17626) Query reoptimization using cached runtime statistics
Date Wed, 07 Mar 2018 08:20:00 GMT


Zoltan Haindrich updated HIVE-17626:
       Resolution: Fixed
    Fix Version/s: 3.0.0
           Status: Resolved  (was: Patch Available)

pushed to master. Thank you Ashutosh for reviewing the changes!

> Query reoptimization using cached runtime statistics
> ----------------------------------------------------
>                 Key: HIVE-17626
>                 URL:
>             Project: Hive
>          Issue Type: New Feature
>          Components: Logical Optimizer
>    Affects Versions: 3.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Zoltan Haindrich
>            Priority: Major
>             Fix For: 3.0.0
>         Attachments: HIVE-17626.01.patch, HIVE-17626.01wip01.patch, HIVE-17626.02.patch,
HIVE-17626.03.patch, HIVE-17626.04.patch, HIVE-17626.05.patch, HIVE-17626.06.patch, HIVE-17626.07A.patch,
HIVE-17626.07B.patch, HIVE-17626.08.patch, HIVE-17626.09.patch, HIVE-17626.10.patch, HIVE-17626.11.patch,
> Something similar to "EXPLAIN ANALYZE", where we annotate the explain plan with actual and
> estimated statistics. The runtime stats can be cached at the query level, and subsequent executions
> of the same query can use the cached statistics from the previous run for better optimization.

> Some use cases:
> 1) Re-planning join queries (mapjoin failures can be converted to shuffle joins)
> 2) Better statistics for the table scan operator if dynamic partition pruning is involved
> 3) Better estimates for bloom filter initialization (setting expected entries during
> This can be extended to support a wider range of queries by caching fragments of operator plans scanning
> the same table(s) or matching some operator sequences.
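
The idea described above (remember the row counts observed at runtime, then prefer them over estimates when the same query is planned again) can be illustrated with a toy sketch. This is not Hive's implementation: `RuntimeStatsCache`, `choose_join`, the operator id `TS_customers`, and the `MAPJOIN_ROW_LIMIT` threshold are all hypothetical names chosen for illustration.

```python
class RuntimeStatsCache:
    """Toy cache: query text -> {operator id -> row count observed at runtime}."""

    def __init__(self):
        self._stats = {}

    def record(self, query, operator_id, actual_rows):
        # Store what the operator really produced during the last run.
        self._stats.setdefault(query, {})[operator_id] = actual_rows

    def lookup(self, query, operator_id):
        # Returns None when the query or operator has not been seen before.
        return self._stats.get(query, {}).get(operator_id)


# Hypothetical limit on the small side of a join for it to fit in memory.
MAPJOIN_ROW_LIMIT = 1_000


def choose_join(query, operator_id, estimated_rows, cache):
    """Pick a join strategy, preferring cached runtime stats over the estimate."""
    observed = cache.lookup(query, operator_id)
    rows = observed if observed is not None else estimated_rows
    return "mapjoin" if rows <= MAPJOIN_ROW_LIMIT else "shuffle join"


cache = RuntimeStatsCache()
query = "SELECT * FROM orders o JOIN customers c ON o.cid = c.id"

# First run: only the (too low) compile-time estimate is available.
first_plan = choose_join(query, "TS_customers", estimated_rows=500, cache=cache)

# Execution reveals the real cardinality, which is cached per query.
cache.record(query, "TS_customers", actual_rows=250_000)

# Second run of the same query: the cached value drives the decision.
second_plan = choose_join(query, "TS_customers", estimated_rows=500, cache=cache)

print(first_plan, "->", second_plan)
```

Keying the cache on the full query text mirrors the "cached at the query level" wording; caching by plan fragment or operator sequence, as suggested for the extension, would need a different key.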

This message was sent by Atlassian JIRA
