hive-dev mailing list archives

From "He Yongqiang (JIRA)" <>
Subject [jira] Commented: (HIVE-352) Make Hive support column based storage
Date Sun, 19 Apr 2009 07:39:47 GMT


He Yongqiang commented on HIVE-352:

I am not sure. However, I observed that SequenceFile does much better in writing and in compression
ratio if the data in all rows is the same.
I will post a patch now, although it is not finished. So far I have only added a big data
test and a complex column data test. The big data test is added in two ways:
1) a new query file, rcfile_bigdata.q, and
2) some test code in TestRCFile, which also includes code comparing SequenceFile
and RCFile. RCFile does not perform better in writing or compression ratio, but it is much better
in reading.

The test results in the previous post were generated by the class PerformTestRCFileAndSeqFile.
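
As a rough illustration of why a column layout can read much faster without necessarily writing or compressing better, here is a minimal Python sketch. It is not the RCFile or SequenceFile on-disk format and not the code in TestRCFile or PerformTestRCFileAndSeqFile. The helper names (row_layout, column_layout, bytes_scanned_for_column) and the sample data are hypothetical and exist only for this example. It compares how many bytes must be scanned to read one column, and prints the zlib-compressed sizes of both layouts without assuming which one wins.

```python
import zlib


def row_layout(rows):
    # Row-oriented: all fields of one row are stored next to each other.
    return "\n".join(",".join(r) for r in rows).encode("utf-8")


def column_layout(rows):
    # Column-oriented: all values of one column are stored next to each other.
    # Returns one byte chunk per column.
    return ["\n".join(col).encode("utf-8") for col in zip(*rows)]


def bytes_scanned_for_column(rows, index):
    # Reading one column from the row layout means scanning every row,
    # while the column layout only needs that column's chunk.
    row_bytes = len(row_layout(rows))
    col_bytes = len(column_layout(rows)[index])
    return row_bytes, col_bytes


# Hypothetical sample table: id, user name, date.
rows = [[str(i), "user%d" % (i % 10), "2009-04-19"] for i in range(1000)]

row_bytes, col_bytes = bytes_scanned_for_column(rows, 1)
row_compressed = len(zlib.compress(row_layout(rows)))
col_compressed = sum(len(zlib.compress(c)) for c in column_layout(rows))

print("bytes scanned to read column 1:", row_bytes, "(row) vs", col_bytes, "(column)")
print("compressed size:", row_compressed, "(row) vs", col_compressed, "(column)")
```

The scan comparison holds by construction, because a single column chunk is a subset of the full row data. The compression figures depend on the data and the codec, which is consistent with the observation above that the outcome changes when all rows contain the same data.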

> Make Hive support column based storage
> --------------------------------------
>                 Key: HIVE-352
>                 URL:
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: hive-352-2009-4-15.patch, hive-352-2009-4-16.patch, hive-352-2009-4-17.patch, HIve-352-draft-2009-03-28.patch, Hive-352-draft-2009-03-30.patch
> Column based storage has been proven a better storage layout for OLAP.
> Hive does a great job on raw row oriented storage. In this issue, we will enhance Hive to support column based storage.
> Actually we have done some work on column based storage on top of HDFS; I think it will need some review and refactoring to port it to Hive.
> Any thoughts?

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
