hadoop-hive-dev mailing list archives

From "He Yongqiang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-352) Make Hive support column based storage
Date Fri, 27 Mar 2009 03:21:50 GMT

    [ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689796#action_12689796 ]

He Yongqiang commented on HIVE-352:

Also, the cost of tuple reconstruction accounts for a large proportion of the whole execution
time. In our initial experiments, the reconstruction cost was much higher than the benefit of
integrating column-oriented execution with the underlying column storage. The reconstruction is
a Map-Reduce join operation. Its cost can be greatly reduced in some queries when we can
reduce the number of tuples that need to be reconstructed. The key to this is late materialization.
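To illustrate why late materialization reduces the reconstruction cost, here is a minimal
in-memory sketch. It is not the Map-Reduce join discussed in this issue: the table, the
column names, and both helper functions are hypothetical, and list positions stand in for
the row ids a real join would use.

```python
# Hypothetical column-wise table: each column is a separate list, and the
# list index plays the role of the row id used for tuple reconstruction.
columns = {
    "user_id": [1, 2, 3, 4, 5, 6],
    "country": ["cn", "us", "cn", "de", "us", "cn"],
    "clicks": [10, 3, 7, 1, 9, 4],
}


def early_materialization(columns, predicate_col, predicate):
    """Rebuild every tuple first, then filter the full rows."""
    names = list(columns)
    rows = [dict(zip(names, values)) for values in zip(*columns.values())]
    reconstructed = len(rows)  # every row was rebuilt before filtering
    result = [row for row in rows if predicate(row[predicate_col])]
    return result, reconstructed


def late_materialization(columns, predicate_col, predicate, output_cols):
    """Filter on one column, then rebuild only the surviving tuples."""
    positions = [i for i, v in enumerate(columns[predicate_col]) if predicate(v)]
    result = [{c: columns[c][i] for c in output_cols} for i in positions]
    return result, len(positions)  # only matching rows were rebuilt


def is_heavy(clicks):
    return clicks >= 9


early_rows, early_count = early_materialization(columns, "clicks", is_heavy)
late_rows, late_count = late_materialization(
    columns, "clicks", is_heavy, ["user_id", "clicks"]
)
print(early_count, late_count)
```

In this toy data the early variant rebuilds all six tuples, while the late variant rebuilds
only the rows that pass the predicate and only the columns the query asks for.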
The current B2.2, however, localizes rows in a single file and adopts a record-level columnar
storage, so it does not have the tuple reconstruction cost. But it needs more specific and
more flexible compression algorithms, and I strongly recommend supporting bitmap files in the future.
To get the main benefit of a columnar strategy, we will also need to add some columnar operators
next.
But for now let us take the first step, and then add more optimizations.
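As a rough illustration of why a record-level columnar layout avoids the reconstruction
join, the sketch below keeps groups of rows together and stores each group column by
column. This is only one reading of the B2.2 description above: the function names, the
sample rows, and the group size are assumptions, and no compression or file I/O is shown.

```python
def to_row_groups(rows, column_names, group_size):
    """Split rows into groups and store each group column by column."""
    groups = []
    for start in range(0, len(rows), group_size):
        chunk = rows[start:start + group_size]
        groups.append({name: [row[i] for row in chunk]
                       for i, name in enumerate(column_names)})
    return groups


def scan(groups, wanted):
    """Read only the wanted columns; rows are rebuilt inside each group."""
    for group in groups:
        cols = [group[name] for name in wanted]
        # Values at the same position in a group belong to the same row,
        # so no join across files is needed to line them up.
        for values in zip(*cols):
            yield values


rows = [(1, "cn", 10), (2, "us", 3), (3, "cn", 7), (4, "de", 1), (5, "us", 9)]
groups = to_row_groups(rows, ["user_id", "country", "clicks"], group_size=2)
projected = list(scan(groups, ["user_id", "clicks"]))
print(projected)
```

Because all columns of a row live in the same group, a scan can skip the unwanted columns
and still line up the remaining values by position.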

> Make Hive support column based storage
> --------------------------------------
>                 Key: HIVE-352
>                 URL: https://issues.apache.org/jira/browse/HIVE-352
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: He Yongqiang
> Column based storage has been proven a better storage layout for OLAP.
> Hive does a great job on raw row oriented storage. In this issue, we will enhance Hive
> to support column based storage.
> Actually we have done some work on column based storage on top of HDFS; I think it will
> need some review and refactoring to port it to Hive.
> Any thoughts?

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
