Message-ID: <864933109.1240226627506.JavaMail.jira@brutus>
Date: Mon, 20 Apr 2009 04:23:47 -0700 (PDT)
From: "He Yongqiang (JIRA)"
To: hive-dev@hadoop.apache.org
Reply-To: hive-dev@hadoop.apache.org
Subject: [jira] Commented: (HIVE-352) Make Hive support column based storage
In-Reply-To: <1379209356.1237274210500.JavaMail.jira@brutus>

[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700765#action_12700765 ]

He Yongqiang commented on
HIVE-352:
-----------------------------------

More explanation of the sharp decrease in read performance:

In our test we use string columns, and the data is randomly generated. When there are only a few columns and the buffer size is the 4M default, every column's buffer is larger than TCP_WINDOW_SIZE, so the if block is not executed when skipping columns. As the number of columns grows, the buffer size each column gets becomes smaller, and eventually most columns' buffers are smaller than TCP_WINDOW_SIZE. That is when the sharp decrease appears.

> Make Hive support column based storage
> --------------------------------------
>
>                 Key: HIVE-352
>                 URL: https://issues.apache.org/jira/browse/HIVE-352
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: hive-352-2009-4-15.patch, hive-352-2009-4-16.patch, hive-352-2009-4-17.patch, hive-352-2009-4-19.patch, HIve-352-draft-2009-03-28.patch, Hive-352-draft-2009-03-30.patch
>
>
> Column based storage has been proven a better storage layout for OLAP.
> Hive does a great job on raw row oriented storage. In this issue, we will enhance Hive to support column based storage.
> Actually we have done some work on column based storage on top of HDFS; I think it will need some review and refactoring to port it to Hive.
> Any thoughts?

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
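
The buffer-size reasoning in the comment can be illustrated with a little arithmetic. The sketch below is only an illustration, not code from the patch: it assumes the 4M buffer is divided evenly across columns and assumes a TCP_WINDOW_SIZE of 128 KB, since the comment does not state the actual value. The function names are hypothetical.

```python
# Hypothetical illustration; names and constants are assumptions, not Hive code.
TOTAL_BUFFER_BYTES = 4 * 1024 * 1024  # the 4M default buffer mentioned in the comment
TCP_WINDOW_SIZE = 128 * 1024          # assumed value; the comment does not give it


def per_column_buffer(num_columns: int) -> int:
    """Bytes each column gets, assuming the buffer is divided evenly."""
    return TOTAL_BUFFER_BYTES // num_columns


def falls_below_window(num_columns: int) -> bool:
    """True once a single column's buffer is smaller than TCP_WINDOW_SIZE."""
    return per_column_buffer(num_columns) < TCP_WINDOW_SIZE


if __name__ == "__main__":
    # Print column count, per-column bytes, and whether it is below the window.
    for n in (4, 16, 32, 33, 64, 128):
        print(n, per_column_buffer(n), falls_below_window(n))
```

Under these assumptions the threshold is crossed a little past 32 columns. With randomly generated string data the real per-column sizes are uneven, so in practice columns would cross the threshold gradually rather than all at once.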