hive-dev mailing list archives

From "Shreepadma Venugopalan (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-3516) Fast incremental statistics computation on columns in Hive tables
Date Fri, 02 Nov 2012 23:52:11 GMT

     [ https://issues.apache.org/jira/browse/HIVE-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shreepadma Venugopalan updated HIVE-3516:
-----------------------------------------

    Summary: Fast incremental statistics computation on columns in Hive tables  (was: Fast incremental statistics computation on column in Hive tables)
    
> Fast incremental statistics computation on columns in Hive tables
> -----------------------------------------------------------------
>
>                 Key: HIVE-3516
>                 URL: https://issues.apache.org/jira/browse/HIVE-3516
>             Project: Hive
>          Issue Type: Bug
>          Components: Statistics
>            Reporter: Shreepadma Venugopalan
>            Assignee: Shreepadma Venugopalan
>
> Statistics computed on columns at the partition level can be rolled up to avoid scanning the
> table again to compute column statistics at the table (global) level. While it's straightforward
> to roll up some statistics such as max, min, avgcollen, and maxcollen, rolling up other statistics
> such as ndv (number of distinct values) requires maintaining intermediate state. This ticket
> covers the tasks of (a) maintaining the intermediate state needed to roll up partition-level
> statistics and (b) detecting that the partition-level statistics can be rolled up and then
> computing table-level statistics from them.
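
The roll-up described above can be sketched roughly as follows. This is an illustration only, not Hive's implementation: the names PartitionColumnStats, compute_partition_stats and roll_up are hypothetical, the column is a toy list of integers whose "column length" is taken as the length of the decimal string, and an exact Python set stands in for whatever mergeable intermediate state a real system would keep for ndv (most likely a compact distinct-value sketch).

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set


@dataclass
class PartitionColumnStats:
    """Hypothetical per-partition statistics for one column."""
    num_values: int
    min_value: int
    max_value: int
    avg_col_len: float
    max_col_len: int
    # Intermediate state for ndv. An exact set is an assumption made for
    # clarity; it stands in for a compact mergeable sketch.
    distinct_state: Set[int] = field(default_factory=set)


def compute_partition_stats(values: Iterable[int]) -> PartitionColumnStats:
    """Scan one (non-empty) partition once and record its statistics."""
    vals = list(values)
    if not vals:
        raise ValueError("partition must contain at least one value")
    lengths = [len(str(v)) for v in vals]
    return PartitionColumnStats(
        num_values=len(vals),
        min_value=min(vals),
        max_value=max(vals),
        avg_col_len=sum(lengths) / len(lengths),
        max_col_len=max(lengths),
        distinct_state=set(vals),
    )


def roll_up(partitions: Iterable[PartitionColumnStats]) -> Dict[str, float]:
    """Derive table-level statistics without rescanning the table."""
    parts = list(partitions)
    if not parts:
        raise ValueError("need at least one partition to roll up")
    total = sum(p.num_values for p in parts)
    # ndv cannot be obtained by adding per-partition ndv values, because a
    # value may occur in several partitions; merge the intermediate state.
    merged = set().union(*(p.distinct_state for p in parts))
    return {
        "min": min(p.min_value for p in parts),
        "max": max(p.max_value for p in parts),
        "maxcollen": max(p.max_col_len for p in parts),
        # The average must be weighted by each partition's value count.
        "avgcollen": sum(p.avg_col_len * p.num_values for p in parts) / total,
        "ndv": len(merged),
    }


p1 = compute_partition_stats([1, 22, 333])
p2 = compute_partition_stats([22, 4444])
table_stats = roll_up([p1, p2])
```

In this toy input the value 22 appears in both partitions, so summing the per-partition distinct counts would give 5, while merging the intermediate state gives the correct table-level ndv of 4. That overlap is why min, max, and the length statistics roll up directly but ndv needs extra state.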

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
