hive-dev mailing list archives

From "pengcheng xiong (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-7506) MetadataUpdater: provide a mechanism to edit the statistics of a column in a table (or a partition of a table)
Date Thu, 24 Jul 2014 20:26:38 GMT

     [ https://issues.apache.org/jira/browse/HIVE-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

pengcheng xiong updated HIVE-7506:
----------------------------------

    Description: 
Two motivations:

(1) The cost-based optimizer (CBO) depends heavily on the statistics of a column in a table
(or a partition of a table). If we would like to test whether CBO chooses the best plan under
different statistics, it would be time-consuming to load the whole table and create the
statistics from the ground up.

(2) As the database runs, the statistics of a column in a table (or a partition of a table) may
change. We need a mechanism to keep them synchronized.

We propose the following command to achieve that:

ALTER TABLE table_name PARTITION partition_spec [COLUMN col_name] UPDATE STATISTICS col_statistics [COMMENT col_comment]




  was:
Two motivations:

(1) The cost-based optimizer (CBO) depends heavily on the statistics of a column in a table
(or a partition of a table). If we would like to test whether CBO chooses the best plan under
different statistics, it would be time-consuming to load the whole table and create the
statistics from the ground up.

(2) As the database runs, the statistics of a column in a table (or a partition of a table) may
change. We need a mechanism to keep them synchronized.

We propose the following command to achieve that:

ALTER TABLE table_name PARTITION partition_spec COLUMN col_name UPDATE STATISTICS col_statistics [COMMENT col_comment]





> MetadataUpdater: provide a mechanism to edit the statistics of a column in a table (or a partition of a table)
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-7506
>                 URL: https://issues.apache.org/jira/browse/HIVE-7506
>             Project: Hive
>          Issue Type: New Feature
>          Components: Database/Schema
>            Reporter: pengcheng xiong
>            Assignee: pengcheng xiong
>            Priority: Critical
>   Original Estimate: 252h
>  Remaining Estimate: 252h
>
> Two motivations:
> (1) The cost-based optimizer (CBO) depends heavily on the statistics of a column in a table
> (or a partition of a table). If we would like to test whether CBO chooses the best plan under
> different statistics, it would be time-consuming to load the whole table and create the
> statistics from the ground up.
> (2) As the database runs, the statistics of a column in a table (or a partition of a table)
> may change. We need a mechanism to keep them synchronized.
> We propose the following command to achieve that:
> ALTER TABLE table_name PARTITION partition_spec [COLUMN col_name] UPDATE STATISTICS col_statistics [COMMENT col_comment]
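
To make the optional clauses of the proposed command concrete, here is a minimal Python sketch that renders the command as a string. It is only an illustration: the function name, the table, partition, and column names, and the statistic key are hypothetical, and the ticket does not define the format of col_statistics, so a parenthesized 'key'='value' list is assumed here.

```python
from typing import Mapping, Optional


def render_update_statistics(
    table_name: str,
    partition_spec: Mapping[str, str],
    col_statistics: Mapping[str, str],
    col_name: Optional[str] = None,
    col_comment: Optional[str] = None,
) -> str:
    """Render the command proposed in HIVE-7506 as a single string.

    Assumption: col_statistics is written as a parenthesized
    'key'='value' list; the ticket does not specify its format.
    """
    def quote(value: str) -> str:
        # Escape single quotes; a simplification, not full HiveQL escaping.
        return "'" + value.replace("'", "\\'") + "'"

    partition = ", ".join(f"{k}={quote(v)}" for k, v in partition_spec.items())
    stats = ", ".join(f"{quote(k)}={quote(v)}" for k, v in col_statistics.items())

    parts = [f"ALTER TABLE {table_name}", f"PARTITION ({partition})"]
    if col_name is not None:
        # [COLUMN col_name] is optional in the updated description.
        parts.append(f"COLUMN {col_name}")
    parts.append(f"UPDATE STATISTICS ({stats})")
    if col_comment is not None:
        # [COMMENT col_comment] is optional.
        parts.append(f"COMMENT {quote(col_comment)}")
    return " ".join(parts)


# Hypothetical table, partition, column, and statistic names.
print(render_update_statistics(
    "sales", {"ds": "2014-07-24"}, {"numDVs": "100"}, col_name="price"))
```

Omitting col_name drops the COLUMN clause, which is the only difference between the updated description and the previous one.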



--
This message was sent by Atlassian JIRA
(v6.2#6252)
