hive-dev mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-5483) use metastore statistics to optimize max/min/etc. queries
Date Mon, 07 Oct 2013 22:33:41 GMT
Sergey Shelukhin created HIVE-5483:
--------------------------------------

             Summary: use metastore statistics to optimize max/min/etc. queries
                 Key: HIVE-5483
                 URL: https://issues.apache.org/jira/browse/HIVE-5483
             Project: Hive
          Issue Type: Improvement
            Reporter: Sergey Shelukhin


We have discussed this briefly.
Hive can answer queries such as select max(c1) from t purely from the metastore, using partition
statistics, provided that we know the statistics are up to date.
All data changes (e.g. adding new partitions) currently go through the metastore, so we can track
whether the statistics are up to date. If they are not, queries will have to read the data (at least
for the outdated partitions) until someone runs analyze table. We can also analyze new partitions
right after they are added, if that is configured or specified in the command.
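
To make the idea concrete, here is a minimal sketch of the proposed evaluation strategy. It is not
Hive code: PartitionStats, max_c1 and the scan_partition callback are hypothetical names, and the
up_to_date flag stands in for whatever freshness tracking the metastore would actually keep.
Partitions with fresh statistics are answered from metadata; the rest fall back to reading data.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass
class PartitionStats:
    """Hypothetical per-partition record, as the metastore might expose it."""
    name: str
    max_c1: Optional[int]  # recorded max of c1; None if the partition was never analyzed
    up_to_date: bool       # assumed flag: no data change since the last analyze


def max_c1(
    partitions: Iterable[PartitionStats],
    scan_partition: Callable[[str], Optional[int]],
) -> Tuple[Optional[int], List[str]]:
    """Return (max of c1, names of the partitions that had to be scanned)."""
    best: Optional[int] = None
    scanned: List[str] = []
    for p in partitions:
        if p.up_to_date and p.max_c1 is not None:
            # Fresh statistics: answer from metadata only.
            candidate = p.max_c1
        else:
            # Stale or missing statistics: read the partition's data.
            candidate = scan_partition(p.name)
            scanned.append(p.name)
        if candidate is not None and (best is None or candidate > best):
            best = candidate
    return best, scanned


# Toy data standing in for the table contents (assumption, for illustration only).
data = {"ds=1": [3, 9], "ds=2": [4, 12], "ds=3": [7]}
parts = [
    PartitionStats("ds=1", 9, True),      # fresh stats, no scan needed
    PartitionStats("ds=2", 5, False),     # stale stats, must be scanned
    PartitionStats("ds=3", None, False),  # newly added, never analyzed
]
result, scanned = max_c1(parts, lambda name: max(data[name], default=None))
```

In this toy run only ds=2 and ds=3 are read. Once analyze table has been run on them, the same
query would touch no data at all.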



--
This message was sent by Atlassian JIRA
(v6.1#6144)
