Date: Fri, 2 Nov 2012 23:52:11 +0000 (UTC)
From: "Shreepadma Venugopalan (JIRA)"
To: hive-dev@hadoop.apache.org
Reply-To: dev@hive.apache.org
Message-ID: <1460108134.63649.1351900332101.JavaMail.jiratomcat@arcas>
In-Reply-To: <820835128.143603.1348947007829.JavaMail.jiratomcat@arcas>
Subject: [jira] [Updated] (HIVE-3516) Fast incremental statistics computation on columns in Hive tables

     [ https://issues.apache.org/jira/browse/HIVE-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shreepadma Venugopalan updated HIVE-3516:
-----------------------------------------
    Summary: Fast incremental statistics computation on columns in Hive tables  (was: Fast incremental statistics computation on column in Hive tables)

> Fast incremental statistics computation on columns in Hive tables
> -----------------------------------------------------------------
>
>                 Key: HIVE-3516
>                 URL: https://issues.apache.org/jira/browse/HIVE-3516
>             Project: Hive
>          Issue Type: Bug
>          Components: Statistics
>            Reporter: Shreepadma Venugopalan
>            Assignee: Shreepadma Venugopalan
>
> Statistics computed on Hive columns at the partition level can be rolled up to produce column statistics at the table (global) level without scanning the table again. While it's straightforward to roll up some statistics, such as max, min, avgcollen, and maxcollen, rolling up others, such as ndv (number of distinct values), requires maintaining intermediate state. This ticket covers the tasks of (a) maintaining the intermediate state needed to roll up partition-level statistics, and (b) detecting that the partition-level statistics can be rolled up, and then computing table-level statistics from them.
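
To illustrate the roll-up described above, here is a minimal sketch in Python. It is not the implementation proposed in this ticket: the names (ColumnStats, compute_partition_stats, roll_up, estimate_ndv), the use of string length as the column length, and the choice of a K-minimum-values sketch as the intermediate state for ndv are all assumptions made for the example. The point it shows is that min, max, maxcollen, and a row-weighted avgcollen combine directly, while ndv cannot be summed and needs mergeable per-partition state.

```python
import hashlib
from dataclasses import dataclass

K = 256  # hypothetical sketch size; larger K gives a tighter NDV estimate


def _hash01(value) -> float:
    """Map a value to a stable pseudo-random float in [0, 1)."""
    digest = hashlib.sha1(repr(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


@dataclass
class ColumnStats:
    num_rows: int
    min_val: int
    max_val: int
    avg_col_len: float
    max_col_len: int
    kmv: list  # the K smallest distinct hash values: intermediate state for NDV


def compute_partition_stats(values) -> ColumnStats:
    """Scan one partition's column values once (assumed non-empty)."""
    lengths = [len(str(v)) for v in values]  # assumption: length of the string form
    kmv = sorted({_hash01(v) for v in values})[:K]
    return ColumnStats(
        num_rows=len(values),
        min_val=min(values),
        max_val=max(values),
        avg_col_len=sum(lengths) / len(lengths),
        max_col_len=max(lengths),
        kmv=kmv,
    )


def estimate_ndv(kmv) -> int:
    """Exact count while the sketch is not full, KMV estimate otherwise."""
    if len(kmv) < K:
        return len(kmv)
    return int((K - 1) / kmv[-1])


def roll_up(parts) -> ColumnStats:
    """Combine partition-level stats into table-level stats without a rescan."""
    total_rows = sum(p.num_rows for p in parts)
    merged_kmv = sorted(set().union(*(p.kmv for p in parts)))[:K]
    return ColumnStats(
        num_rows=total_rows,
        min_val=min(p.min_val for p in parts),
        max_val=max(p.max_val for p in parts),
        # The average must be weighted by each partition's row count.
        avg_col_len=sum(p.avg_col_len * p.num_rows for p in parts) / total_rows,
        max_col_len=max(p.max_col_len for p in parts),
        kmv=merged_kmv,
    )


p1 = compute_partition_stats([1, 2, 3, 10])
p2 = compute_partition_stats([3, 4, 100])
table = roll_up([p1, p2])
```

In this example the value 3 appears in both partitions, so adding the per-partition ndv values would overcount; merging the sketches and estimating from the merged state avoids that. Any mergeable distinct-count structure could stand in for the KMV sketch used here.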