hive-dev mailing list archives

From "He Yongqiang (JIRA)" <>
Subject [jira] Commented: (HIVE-352) Make Hive support column based storage
Date Wed, 25 Mar 2009 16:34:54 GMT


He Yongqiang commented on HIVE-352:

One problem with this RCFile is that it needs to know the needed columns in advance, so that it
can skip unneeded columns and avoid decompressing them.
I took a look at Hive's operators and SerDe. It seems that they all take a whole row object
as input and do not know which columns are needed before processing.
For example, LazyStruct and StructObjectInspector only learn which column is needed when
getField/getStructFieldData is invoked by the operators' evaluators (such as ExprNodeColumnEvaluator).
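
To make the mismatch concrete, here is a rough Java sketch of the two access patterns. It is not Hive or RCFile code: ColumnGroup, ProjectingReader and LazyRow are hypothetical names invented for illustration, and "decompress" is only a counter standing in for a real codec. The first reader must be given the needed column ids before it reads anything; the lazy row finds out which column is needed only when getField is called.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for one row group of a columnar file: one stored blob per column.
final class ColumnGroup {
    private final String[] storedColumns;                  // pretend each entry is compressed
    final List<Integer> decompressed = new ArrayList<>();  // records which columns were decoded

    ColumnGroup(String... storedColumns) {
        this.storedColumns = storedColumns;
    }

    int columnCount() {
        return storedColumns.length;
    }

    // Stand-in for the expensive step; a real reader would run a codec here.
    String decompress(int column) {
        decompressed.add(column);
        return storedColumns[column];
    }
}

// Eager pattern: the needed columns are supplied up front, everything else is skipped.
final class ProjectingReader {
    private final Set<Integer> neededColumns;

    ProjectingReader(Set<Integer> neededColumns) {
        this.neededColumns = neededColumns;
    }

    String[] readRow(ColumnGroup group) {
        String[] row = new String[group.columnCount()];
        for (int c = 0; c < row.length; c++) {
            if (neededColumns.contains(c)) {
                row[c] = group.decompress(c);
            }
        }
        return row; // skipped columns stay null
    }
}

// Lazy pattern: nothing is decoded until a caller asks for a specific field.
final class LazyRow {
    private final ColumnGroup group;
    private final String[] cache;
    private final boolean[] loaded;

    LazyRow(ColumnGroup group) {
        this.group = group;
        this.cache = new String[group.columnCount()];
        this.loaded = new boolean[group.columnCount()];
    }

    String getField(int column) {
        if (!loaded[column]) {
            cache[column] = group.decompress(column);
            loaded[column] = true;
        }
        return cache[column];
    }
}

final class ColumnPruningSketch {
    public static void main(String[] args) {
        ColumnGroup eagerGroup = new ColumnGroup("id", "name", "payload");
        new ProjectingReader(new TreeSet<>(Arrays.asList(1))).readRow(eagerGroup);
        System.out.println("eager reader decoded columns: " + eagerGroup.decompressed);

        ColumnGroup lazyGroup = new ColumnGroup("id", "name", "payload");
        LazyRow row = new LazyRow(lazyGroup);
        row.getField(1);
        System.out.println("lazy row decoded columns: " + lazyGroup.decompressed);
    }
}
```

In this toy version both patterns end up decoding only the requested column. The difference is where the knowledge lives: the eager reader depends on the caller passing the column set down before reading, which is the information the operators do not currently have ahead of time.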

> Make Hive support column based storage
> --------------------------------------
>                 Key: HIVE-352
>                 URL:
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: He Yongqiang
> Column-based storage has proven to be a better storage layout for OLAP.
> Hive does a great job on raw row-oriented storage. In this issue, we will enhance Hive
> to support column-based storage.
> Actually, we have done some work on column-based storage on top of HDFS. I think it will
> need some review and refactoring to port it to Hive.
> Any thoughts?

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
