hive-dev mailing list archives

From "Lefty Leverenz (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-7506) MetadataUpdater: provide a mechanism to edit the statistics of a column in a table (or a partition of a table)
Date Thu, 14 Aug 2014 07:19:12 GMT

    [ https://issues.apache.org/jira/browse/HIVE-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096692#comment-14096692 ]

Lefty Leverenz commented on HIVE-7506:
--------------------------------------

This should be documented in a new subsection of the "Alter Either Table or Partition" section
of the DDL doc, and also in the Statistics doc, with examples, release information, a link back
to this jira, and links between the two docs:

* [DDL -- Alter Either Table or Partition | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterEitherTableorPartition]
* [Statistics in Hive | https://cwiki.apache.org/confluence/display/Hive/StatsDev]

A release note would also be good.

> MetadataUpdater: provide a mechanism to edit the statistics of a column in a table (or a partition of a table)
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-7506
>                 URL: https://issues.apache.org/jira/browse/HIVE-7506
>             Project: Hive
>          Issue Type: New Feature
>          Components: Statistics
>            Reporter: pengcheng xiong
>            Assignee: pengcheng xiong
>            Priority: Minor
>              Labels: TODOC14
>             Fix For: 0.14.0
>
>         Attachments: HIVE-7506.1.patch, HIVE-7506.3.patch, HIVE-7506.4.patch, HIVE-7506.5.patch, HIVE-7506.6.patch, HIVE-7506.7.patch, HIVE-7506.8.patch, HIVE-7506.patch
>
>   Original Estimate: 252h
>  Remaining Estimate: 252h
>
> Two motivations:
> (1) The Cost-based Optimizer (CBO) depends heavily on the statistics of a column in a table (or a partition of a table). If we would like to test whether CBO chooses the best plan under different statistics, it would be time-consuming to load the whole table and create the statistics from the ground up.
> (2) As the database runs, the statistics of a column in a table (or a partition of a table) may change. We need a mechanism to synchronize them.
> We propose the following command to achieve that:
> ALTER TABLE table_name PARTITION partition_spec [COLUMN col_name] UPDATE STATISTICS col_statistics [COMMENT col_comment]



--
This message was sent by Atlassian JIRA
(v6.2#6252)
