hadoop-hive-dev mailing list archives

From "He Yongqiang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-352) Make Hive support column based storage
Date Fri, 27 Mar 2009 03:03:50 GMT

    [ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689794#action_12689794 ]

He Yongqiang commented on HIVE-352:

Thanks, Raghotham Murthy.
Besides these two posts, there are also several useful papers, such as:
C-Store: A Column-oriented DBMS
Column-Stores vs. Row-Stores: How Different Are They Really? (SIGMOD 2008)
A Comparison of C-Store and Row-Store in a Common Framework
Materialization Strategies in a Column-Oriented DBMS
Integrating Compression and Execution in Column-Oriented Database Systems

These papers, written mostly (all?) by people at Vertica, put most of their emphasis
on a column-oriented execution layer working together with a column storage layer. I totally
agree with this view. In fact, we have observed that operators built on the map-reduce approach
differ in many ways from the ones implemented in systems like C-Store. We also found that
bitmap compression can greatly reduce execution time.
So I guess we can first try to support a column storage layer, and then add some
column-oriented operators and column-specific compression algorithms.
I will try to provide a small prototype of the storage layer as soon as possible.
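
To make the idea of a column storage layer with column-specific compression concrete, here is a minimal Python sketch. It is not Hive code and not the prototype mentioned above; the function names (rows_to_columns, bitmap_encode, select_rows) and the sample table are hypothetical and exist only for illustration. It pivots rows into one list per column, bitmap-encodes a low-cardinality column, and answers an equality filter by reading only the two columns the query needs.

```python
from collections import defaultdict


def rows_to_columns(rows, schema):
    """Pivot row tuples into one list per column (a toy column layout)."""
    columns = {name: [] for name in schema}
    for row in rows:
        for name, value in zip(schema, row):
            columns[name].append(value)
    return columns


def bitmap_encode(values):
    """Build one bitmap per distinct value, stored as a Python int.

    Bit i of a value's bitmap is set when row i holds that value.
    """
    bitmaps = defaultdict(int)
    for i, value in enumerate(values):
        bitmaps[value] |= 1 << i
    return dict(bitmaps)


def select_rows(bitmap):
    """Return the row positions whose bit is set in the bitmap."""
    positions = []
    i = 0
    while bitmap:
        if bitmap & 1:
            positions.append(i)
        bitmap >>= 1
        i += 1
    return positions


# Hypothetical sample table, used only for this illustration.
schema = ("user_id", "country", "clicks")
rows = [(1, "cn", 10), (2, "us", 3), (3, "cn", 7), (4, "in", 1)]

columns = rows_to_columns(rows, schema)
country_bitmaps = bitmap_encode(columns["country"])

# Roughly: SELECT SUM(clicks) WHERE country = 'cn'.
# Only the "country" bitmap and the "clicks" column are touched.
cn_rows = select_rows(country_bitmaps["cn"])
cn_clicks = sum(columns["clicks"][i] for i in cn_rows)
```

In this toy version the user_id column is never read for the query, which is the main I/O saving a column layout aims for. A real storage layer would also need to split columns into blocks on HDFS and compress the bitmaps themselves (for example with run-length encoding), neither of which is attempted here.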

> Make Hive support column based storage
> --------------------------------------
>                 Key: HIVE-352
>                 URL: https://issues.apache.org/jira/browse/HIVE-352
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: He Yongqiang
> Column-based storage has been proven to be a better storage layout for OLAP.
> Hive does a great job on raw row-oriented storage. In this issue, we will enhance Hive
> to support column-based storage.
> Actually, we have done some work on column-based storage on top of HDFS. I think it will
> need some review and refactoring to port it to Hive.
> Any thoughts?

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
