hive-dev mailing list archives

From "Ning Zhang (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HIVE-1699) incorrect partition pruning ANALYZE TABLE
Date Tue, 12 Oct 2010 20:58:33 GMT

     [ https://issues.apache.org/jira/browse/HIVE-1699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-1699:
-----------------------------

    Attachment: HIVE-1699.patch

This patch includes the following changes:

  1) Correctly prunes partitions based on the partition specification in the ANALYZE TABLE command.
  2) Adds a Hive.getPartitionsByNames() method to get a list of partitions based on their
names. Previously we had to use Hive.getPartitions, which gets all partitions as Partition
objects, and then filter out the partitions that don't satisfy the spec. This is very expensive
for tables with a large number of partitions. It could be improved further by using the
partition filtering pushdown feature once that is fully supported.
  3) Caches the list of partitions in tableSpec so that StatsTask does not need to fetch the
list of partitions again.
  4) Adds an explicit variable to tableSpec to indicate its type (TABLE_ONLY, STATIC_PARTITION,
DYNAMIC_PARTITION) rather than relying on an implicit check of partHandle.
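
The idea behind changes 1), 2) and 4) can be sketched roughly as follows. This is a
self-contained illustration, not code from HIVE-1699.patch: the class and method names
(PartitionPruningSketch, classify, parseName, pruneNames) are hypothetical, partition names
are assumed to have the form col=value/col=value, and the rule that a spec with an unset
column counts as DYNAMIC_PARTITION is an assumption made for the example. The point is only
that filtering cheap name strings against the spec avoids materializing every Partition
object first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from the Hive code base
// except the three enum constants quoted in the message above.
class PartitionPruningSketch {

    // Explicit type of the ANALYZE TABLE target, instead of inferring it
    // from whether some handle happens to be null.
    enum SpecType { TABLE_ONLY, STATIC_PARTITION, DYNAMIC_PARTITION }

    // Assumed rule: no spec -> TABLE_ONLY; every partition column bound to a
    // value -> STATIC_PARTITION; any column left unbound -> DYNAMIC_PARTITION.
    static SpecType classify(List<String> partCols, Map<String, String> spec) {
        if (spec == null || spec.isEmpty()) {
            return SpecType.TABLE_ONLY;
        }
        for (String col : partCols) {
            if (spec.get(col) == null) {
                return SpecType.DYNAMIC_PARTITION;
            }
        }
        return SpecType.STATIC_PARTITION;
    }

    // Parse a name such as "ds=2010-10-12/hr=11" into column -> value pairs.
    static Map<String, String> parseName(String name) {
        Map<String, String> kv = new LinkedHashMap<>();
        for (String part : name.split("/")) {
            int eq = part.indexOf('=');
            if (eq < 0) {
                continue; // skip malformed components in this sketch
            }
            kv.put(part.substring(0, eq), part.substring(eq + 1));
        }
        return kv;
    }

    // Keep only the names whose values agree with every bound column of the
    // spec. Columns mapped to null are treated as unconstrained.
    static List<String> pruneNames(List<String> allNames, Map<String, String> spec) {
        List<String> kept = new ArrayList<>();
        for (String name : allNames) {
            Map<String, String> kv = parseName(name);
            boolean matches = true;
            for (Map.Entry<String, String> e : spec.entrySet()) {
                if (e.getValue() != null && !e.getValue().equals(kv.get(e.getKey()))) {
                    matches = false;
                    break;
                }
            }
            if (matches) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
            "ds=2010-10-11/hr=11", "ds=2010-10-12/hr=11", "ds=2010-10-12/hr=12");
        Map<String, String> spec = new LinkedHashMap<>();
        spec.put("ds", "2010-10-12");
        spec.put("hr", null); // hr left unbound
        System.out.println(classify(Arrays.asList("ds", "hr"), spec));
        System.out.println(pruneNames(names, spec));
    }
}
```

In the patch itself, the surviving names would presumably be handed to
Hive.getPartitionsByNames() and the result cached in tableSpec as described in 3); the
signature of that method is not shown in this message, so the sketch stops at the name list.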

> incorrect partition pruning ANALYZE TABLE
> -----------------------------------------
>
>                 Key: HIVE-1699
>                 URL: https://issues.apache.org/jira/browse/HIVE-1699
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Ning Zhang
>            Assignee: Ning Zhang
>         Attachments: HIVE-1699.patch
>
>
> If table T is partitioned, ANALYZE TABLE T PARTITION (...) COMPUTE STATISTICS; gathers
> stats for all partitions even when the partition spec selects only a subset.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

