spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (SPARK-8312) Populate statistics info of hive tables if it's needed to be
Date Thu, 11 Jun 2015 21:40:01 GMT

     [ https://issues.apache.org/jira/browse/SPARK-8312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-8312:
-----------------------------------

    Assignee:     (was: Apache Spark)

> Populate statistics info of hive tables if it's needed to be
> ------------------------------------------------------------
>
>                 Key: SPARK-8312
>                 URL: https://issues.apache.org/jira/browse/SPARK-8312
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Navis
>            Priority: Minor
>
> Currently, spark-sql uses the stats stored in the metastore to estimate the size of a Hive
> table, which means the ANALYZE command should be run before the table is accessed to get
> better planning, especially for joins. Even with those stats, the estimate cannot reflect the
> real input size of a query that contains a partition pruning predicate.
> Worse, Hive cannot update metastore stats for external tables, a bug that was fixed recently
> in HIVE-6727. That issue says the bug affects all Hive versions between 0.13.0 and 1.2.0.
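
To illustrate the first point in the description, here is a minimal sketch of how a table-level size estimate can differ from the size actually read once a partition predicate prunes most partitions. It is not Spark's planner code: the partition names, byte sizes, helper functions, and the 10 MB threshold are all assumptions made up for this example.

```python
# Hypothetical per-partition sizes in bytes; these numbers are not from the issue.
PARTITION_SIZES = {
    "ds=2015-06-09": 40 * 1024 * 1024,
    "ds=2015-06-10": 35 * 1024 * 1024,
    "ds=2015-06-11": 5 * 1024 * 1024,
}

# Assumed size limit below which a join side is considered small enough
# to broadcast (10 MB here, chosen only for illustration).
BROADCAST_THRESHOLD = 10 * 1024 * 1024


def table_level_size(partition_sizes):
    """Size a planner would see if it only knew one stat for the whole table."""
    return sum(partition_sizes.values())


def pruned_size(partition_sizes, predicate):
    """Size of only the partitions that survive the pruning predicate."""
    return sum(size for name, size in partition_sizes.items() if predicate(name))


def can_broadcast(size_in_bytes, threshold=BROADCAST_THRESHOLD):
    """True when the estimated input is at or below the assumed threshold."""
    return size_in_bytes <= threshold


whole = table_level_size(PARTITION_SIZES)
pruned = pruned_size(PARTITION_SIZES, lambda name: name == "ds=2015-06-11")

print("table-level estimate:", whole, "broadcast?", can_broadcast(whole))
print("after pruning:       ", pruned, "broadcast?", can_broadcast(pruned))
```

In this made-up case the whole-table figure rules out a broadcast join, while the pruned input would fit under the assumed threshold. That gap is the kind of mismatch the description refers to.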



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

