Mailing-List: contact hive-dev-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hive-dev@hadoop.apache.org
Message-ID: <864933109.1240226627506.JavaMail.jira@brutus>
Date: Mon, 20 Apr 2009 04:23:47 -0700 (PDT)
From: "He Yongqiang (JIRA)" <jira@apache.org>
To: hive-dev@hadoop.apache.org
Subject: [jira] Commented: (HIVE-352) Make Hive support column based storage
In-Reply-To: <1379209356.1237274210500.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700765#action_12700765 ] 

He Yongqiang commented on HIVE-352:
-----------------------------------

More explanation of the sharp read-performance decrease:
In our test we use string columns, and the data is randomly generated.
When there are only a few columns and the buffer size is the 4M default, every column's buffer is larger than TCP_WINDOW_SIZE, so the if block is not executed when skipping columns. As the number of columns grows, the buffer each column gets becomes smaller, and eventually most columns' buffers are smaller than TCP_WINDOW_SIZE. That is when the sharp decrease appears.
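
The arithmetic behind this can be sketched roughly as follows. This is only an illustration, not the actual reader code: the 4M total is the default mentioned above, while the 128 KB value for TCP_WINDOW_SIZE, the even split of the buffer across columns, the `<=` comparison, and the function names are all assumptions made for the example.

```python
# Hypothetical numbers: the 4 MB total comes from the comment above; the
# 128 KB window and the even split across columns are assumptions.
TOTAL_BUFFER_BYTES = 4 * 1024 * 1024
ASSUMED_TCP_WINDOW_SIZE = 128 * 1024


def per_column_buffer(total_bytes: int, num_columns: int) -> int:
    """Bytes each column gets if the total buffer is split evenly."""
    if num_columns <= 0:
        raise ValueError("num_columns must be positive")
    return total_bytes // num_columns


def skip_takes_if_block(skip_bytes: int, window: int = ASSUMED_TCP_WINDOW_SIZE) -> bool:
    """True when a skip is short enough to fall into the small-skip branch."""
    return skip_bytes <= window


if __name__ == "__main__":
    # Print how the per-column buffer shrinks as the column count grows.
    for n in (4, 16, 64, 128):
        size = per_column_buffer(TOTAL_BUFFER_BYTES, n)
        print(n, size, skip_takes_if_block(size))
```

Under these assumptions, a table with a handful of columns keeps every column's buffer well above the window, while a table with many columns pushes most skips into the small-skip branch, which matches the point at which the slowdown shows up in our test.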

> Make Hive support column based storage
> --------------------------------------
>
>                 Key: HIVE-352
>                 URL: https://issues.apache.org/jira/browse/HIVE-352
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: hive-352-2009-4-15.patch, hive-352-2009-4-16.patch, hive-352-2009-4-17.patch, hive-352-2009-4-19.patch, HIve-352-draft-2009-03-28.patch, Hive-352-draft-2009-03-30.patch
>
>
> Column-based storage has proven to be a better storage layout for OLAP.
> Hive does a great job on raw row-oriented storage. In this issue, we will enhance Hive to support column-based storage.
> Actually, we have done some work on column-based storage on top of HDFS; I think it will need some review and refactoring to port it to Hive.
> Any thoughts?
