hadoop-hive-dev mailing list archives

From "Joydeep Sen Sarma (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-352) Make Hive support column based storage
Date Mon, 23 Mar 2009 19:57:50 GMT

    [ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688397#action_12688397 ]

Joydeep Sen Sarma commented on HIVE-352:

If you are doing B2.2 - I think it's still pretty easy to make sure that we don't decompress
all columns when we only want a few. With SequenceFile record compression, every column gets
decompressed on each read - and I think the performance gain might be much less (the benefit
would be reduced primarily to better compression of the data due to the columnar format).

In the past I have written a dummy writable class that doesn't deserialize - it just hands
the input stream supplied by Hadoop to the application. (The serialization framework does
this in a less hacky way - and we could do that as well.) If you do it this way, the Hive serde
can get a massive blob of binary data and then, based on header metadata, decompress only
the relevant parts of it.
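
A minimal sketch of the "blob plus header metadata" idea, in Python for brevity. Everything here is an assumption made for illustration: the header layout (a column count followed by one compressed length per column), the function names pack_block and read_columns, and the use of zlib as the codec. It is not the format proposed in HIVE-352 and it does not use any Hadoop or Hive API; it only shows that a reader holding the raw bytes can skip the columns it does not need.

```python
import struct
import zlib


def pack_block(columns):
    """Compress each column separately and prepend a small header.

    Hypothetical layout: [ncols:uint32][len_0:uint32]...[len_n-1:uint32][chunks...]
    """
    chunks = [zlib.compress(col) for col in columns]
    header = struct.pack(">I", len(chunks))
    header += b"".join(struct.pack(">I", len(c)) for c in chunks)
    return header + b"".join(chunks)


def read_columns(blob, wanted):
    """Decompress only the columns whose indexes are in `wanted`."""
    (ncols,) = struct.unpack_from(">I", blob, 0)
    lengths = struct.unpack_from(">%dI" % ncols, blob, 4)
    offset = 4 + 4 * ncols  # first byte after the header
    result = {}
    for idx, length in enumerate(lengths):
        if idx in wanted:
            result[idx] = zlib.decompress(blob[offset:offset + length])
        # Unwanted columns are skipped using the header lengths alone.
        offset += length
    return result


blob = pack_block([b"1\n2\n3\n", b"alice\nbob\ncarol\n", b"x\ny\nz\n"])
print(read_columns(blob, {1}))
```

Only the requested column's bytes go through the decompressor here; the other columns cost an offset increment each. That is the saving that is lost when the whole record is decompressed before the reader sees it.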

i.e. I don't think we ever need to do B2.1 if we do B2.2 this way.

> Make Hive support column based storage
> --------------------------------------
>                 Key: HIVE-352
>                 URL: https://issues.apache.org/jira/browse/HIVE-352
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: He Yongqiang
> Column based storage has been proven a better storage layout for OLAP.
> Hive does a great job on raw row oriented storage. In this issue, we will enhance Hive
> to support column based storage.
> Actually, we have done some work on column based storage on top of HDFS; I think it will
> need some review and refactoring to port it to Hive.
> Any thoughts?

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
