hive-dev mailing list archives

From "He Yongqiang (JIRA)" <>
Subject [jira] Commented: (HIVE-352) Make Hive support column based storage
Date Mon, 20 Apr 2009 11:23:47 GMT


He Yongqiang commented on HIVE-352:

More explanation of the sharp drop in read performance:
In our test we use string columns, and the data is randomly generated.
When there are only a few columns and the buffer size is the 4M default, every column's buffer
is larger than TCP_WINDOW_SIZE, so when skipping columns the if block is not executed. But
as the number of columns grows, the buffer each column gets becomes smaller, and eventually
most columns' buffers are smaller than TCP_WINDOW_SIZE. That is when the sharp drop appears.
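
To make the arithmetic concrete, here is a minimal Java sketch. It assumes the 4M buffer is divided evenly across the columns and assumes a TCP_WINDOW_SIZE of 128 KB; neither detail is stated in this comment, and the class and method names are hypothetical, not taken from the patch.

```java
public class ColumnBufferSketch {
    // Assumed value for illustration only; this comment does not state it.
    static final int TCP_WINDOW_SIZE = 128 * 1024;

    // The 4M default buffer size mentioned in the comment.
    static final int TOTAL_BUFFER_BYTES = 4 * 1024 * 1024;

    // Assumption: the total buffer is shared evenly by all columns.
    static int perColumnBufferBytes(int totalBufferBytes, int columnCount) {
        return totalBufferBytes / columnCount;
    }

    // True when a column's buffer is smaller than TCP_WINDOW_SIZE, which is
    // the situation in which the comment says the sharp drop appears.
    static boolean fallsBelowWindow(int totalBufferBytes, int columnCount) {
        return perColumnBufferBytes(totalBufferBytes, columnCount) < TCP_WINDOW_SIZE;
    }

    public static void main(String[] args) {
        for (int columns : new int[] {4, 16, 32, 64, 128}) {
            System.out.printf("columns=%d perColumnBytes=%d belowWindow=%b%n",
                    columns,
                    perColumnBufferBytes(TOTAL_BUFFER_BYTES, columns),
                    fallsBelowWindow(TOTAL_BUFFER_BYTES, columns));
        }
    }
}
```

Under these assumptions, the per-column buffer stays at or above the window size up to 32 columns and falls below it from 64 columns on, which matches the pattern described above: fine with a few columns, a sharp drop once there are many.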

> Make Hive support column based storage
> --------------------------------------
>                 Key: HIVE-352
>                 URL:
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: hive-352-2009-4-15.patch, hive-352-2009-4-16.patch, hive-352-2009-4-17.patch,
>                      hive-352-2009-4-19.patch, HIve-352-draft-2009-03-28.patch, Hive-352-draft-2009-03-30.patch
> Column-based storage has proven to be a better storage layout for OLAP.
> Hive does a great job on raw row-oriented storage. In this issue, we will enhance Hive
> to support column-based storage.
> Actually, we have done some work on column-based storage on top of HDFS; I think it will
> need some review and refactoring to port it to Hive.
> Any thoughts?

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
