Date: Thu, 11 Jun 2015 21:40:01 +0000 (UTC)
From: "Apache Spark (JIRA)"
To: issues@spark.apache.org
Subject: [jira] [Assigned] (SPARK-8312) Populate statistics info of hive tables if it's needed to be


     [ https://issues.apache.org/jira/browse/SPARK-8312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-8312:
-----------------------------------

    Assignee:     (was: Apache Spark)

> Populate statistics info of hive tables if it's needed to be
> ------------------------------------------------------------
>
>                 Key: SPARK-8312
>                 URL: https://issues.apache.org/jira/browse/SPARK-8312
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Navis
>            Priority: Minor
>
> Currently, spark-sql uses the stats in the metastore to estimate the size of a Hive table, which means the analyze command should be executed before accessing the table for better planning, especially for
joins. But even with the stats, the estimate cannot reflect the real input size of the query when a partition pruning predicate exists in it.
> Even worse, Hive cannot update metastore stats for external tables, which was fixed recently in HIVE-6727. The issue detail says the bug applies to all Hive versions between 0.13.0 and 1.2.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
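
The gap described in the issue can be illustrated with a small Python sketch. This is not Spark's planner code: the `Table` and `Partition` types, the helper names, the sample sizes, and the 10 MB cutoff are all assumptions made up for illustration. It only shows why a whole-table size taken from the metastore can overstate the input of a query that reads a single partition, and why missing stats force a fallback guess.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Assumed cutoff for choosing a broadcast join (illustrative value only).
BROADCAST_THRESHOLD_BYTES = 10 * 1024 * 1024


@dataclass
class Partition:
    spec: Dict[str, str]   # partition column values, e.g. {"dt": "2015-06-11"}
    size_bytes: int        # size of the files under this partition


@dataclass
class Table:
    name: str
    metastore_total_size: Optional[int]  # None models stats that were never populated
    partitions: List[Partition]


def size_from_metastore(table: Table, fallback: int) -> int:
    """Whole-table estimate: the stored stat if present, else a fallback guess."""
    if table.metastore_total_size is None:
        return fallback
    return table.metastore_total_size


def size_after_pruning(table: Table, keep: Callable[[Dict[str, str]], bool]) -> int:
    """Estimate that counts only the partitions surviving the pruning predicate."""
    return sum(p.size_bytes for p in table.partitions if keep(p.spec))


def can_broadcast(size_bytes: int) -> bool:
    """True when the estimated input is small enough for the assumed cutoff."""
    return size_bytes <= BROADCAST_THRESHOLD_BYTES


MB = 1024 * 1024

# Hypothetical table: three daily partitions of 8 MB each, stats up to date.
events = Table(
    name="events",
    metastore_total_size=24 * MB,
    partitions=[
        Partition({"dt": "2015-06-09"}, 8 * MB),
        Partition({"dt": "2015-06-10"}, 8 * MB),
        Partition({"dt": "2015-06-11"}, 8 * MB),
    ],
)

# Hypothetical external table whose stats were never written to the metastore.
external_events = Table("external_events", None, events.partitions)


def one_day(spec: Dict[str, str]) -> bool:
    """Pruning predicate equivalent to: WHERE dt = '2015-06-11'."""
    return spec["dt"] == "2015-06-11"


whole_table = size_from_metastore(events, fallback=2**40)
pruned = size_after_pruning(events, one_day)
```

With these made-up numbers, the whole-table estimate (24 MB) exceeds the assumed cutoff while the pruned estimate (8 MB) does not, so only the pruning-aware estimate would allow the cheaper join plan.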