hive-dev mailing list archives

From "John Sichi (JIRA)"
Subject [jira] Commented: (HIVE-1940) Query Optimization Using Column Metadata and Histograms
Date Thu, 03 Feb 2011 19:55:28 GMT


John Sichi commented on HIVE-1940:

Hi Anja,

To get a DDL script, you can install Hive and then get your DBMS to generate a script.  For
example, with MySQL, you can use the mysqldump utility with --no-data option.
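As a rough sketch of the MySQL route, the snippet below assembles a mysqldump command line that dumps only the table definitions. The database name "metastore", the user "hive", and the helper name build_schema_dump_command are assumptions made for illustration; substitute whatever your own Hive installation uses.

```python
import shlex


def build_schema_dump_command(database, user, host="localhost"):
    """Return a mysqldump argument list that dumps table definitions only.

    The database and user names are supplied by the caller; nothing here
    is specific to Hive beyond the idea of dumping the metastore schema.
    """
    return [
        "mysqldump",
        "--no-data",        # emit CREATE TABLE statements, skip the row data
        f"--host={host}",
        f"--user={user}",
        "--password",       # no value given, so mysqldump prompts for it
        database,
    ]


# Hypothetical names: a metastore database called "metastore", user "hive".
command = build_schema_dump_command("metastore", "hive")

# Print the command in a form that can be pasted into a shell.
print(shlex.join(command))
```

Running the printed command with its output redirected to a file should leave you with a DDL script for the metastore tables.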

For Derby, see

For an E/R diagram, I had good results with the open source tool Power Architect:

(Some manual layout required after reverse engineering.)  You can see an example here:

If you produce a diagram for the complete metastore, we can get it published in the wiki for
others to use.

> Query Optimization Using Column Metadata and Histograms
> -------------------------------------------------------
>                 Key: HIVE-1940
>                 URL:
>             Project: Hive
>          Issue Type: New Feature
>          Components: Metastore, Query Processor
>            Reporter: Anja Gruenheid
> The current basis for cost-based query optimization in Hive is information gathered on
> tables and partitions. To make further improvements in query optimization possible, the next
> step is to develop and implement possibilities to gather information on columns as discussed
> in issue HIVE-33. After that, an implementation of histograms is a possible option to use
> and collect run-time statistics. Next to the actual implementation of these features, it is
> also necessary to develop a consistent storage model for the MetaStore.

This message is automatically generated by JIRA.
For more information on JIRA, see:

