hive-dev mailing list archives

From "John Sichi (JIRA)" <>
Subject [jira] Commented: (HIVE-1940) Query Optimization Using Column Metadata and Histograms
Date Mon, 07 Feb 2011 19:41:57 GMT


John Sichi commented on HIVE-1940:

Awesome diagram!  Can you add it as an attachment and check the radio button to grant license
to ASF so that we can use it in the Hive wiki?

Try loading some data into your partitions; maybe it deferred that part of the schema creation
until then.

There's a tool which can force generation of the entire schema:

There's an ant target generate-schema which invokes it (in metastore/build.xml), but it's
out-of-date because it still references jpox instead of datanucleus (e.g. it should be invoking the datanucleus equivalent instead of org.jpox.SchemaTool).  If you get it working,
submit a patch and we can update it.

> Query Optimization Using Column Metadata and Histograms
> -------------------------------------------------------
>                 Key: HIVE-1940
>                 URL:
>             Project: Hive
>          Issue Type: New Feature
>          Components: Metastore, Query Processor
>            Reporter: Anja Gruenheid
> The current basis for cost-based query optimization in Hive is information gathered on
> tables and partitions. To make further improvements in query optimization possible, the next
> step is to develop and implement possibilities to gather information on columns as discussed
> in issue HIVE-33. After that, an implementation of histograms is a possible option to use
> and collect run-time statistics. Next to the actual implementation of these features, it is
> also necessary to develop a consistent storage model for the MetaStore.

This message is automatically generated by JIRA.