hadoop-hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "He Yongqiang (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HIVE-352) Make Hive support column based storage
Date Thu, 16 Apr 2009 14:36:14 GMT

     [ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

He Yongqiang updated HIVE-352:
------------------------------

    Attachment: hive-352-2009-4-16.patch

against the latest truck.

1) added a simple rcfile_columar.q file for test.
{noformat}
DROP TABLE columnTable;
CREATE table columnTable (key STRING, value STRING)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.ColumnarSerDe'
STORED AS
  INPUTFORMAT 'org.apache.hadoop.hive.ql.io.RCFileInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.RCFileOutputFormat';

FROM src
INSERT OVERWRITE TABLE columnTable SELECT src.key, src.value LIMIT 10;
describe columnTable;

SELECT columnTable.* FROM columnTable;
{noformat}

2) let ColumnarSerDe's serialize returns BytesRefArrayWritable instead of Text

BTW, it seems rcfile_columar.q.out does not contain results of SELECT columnTable.* FROM columnTable;

but after the test, i saw file ql/test/data/warehouse/columntable/attempt_local_0001_r_000000_0,
and it did contain the data inserted. 
Why the select got nothing?

> Make Hive support column based storage
> --------------------------------------
>
>                 Key: HIVE-352
>                 URL: https://issues.apache.org/jira/browse/HIVE-352
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: He Yongqiang
>         Attachments: hive-352-2009-4-15.patch, hive-352-2009-4-16.patch, HIve-352-draft-2009-03-28.patch,
Hive-352-draft-2009-03-30.patch
>
>
> column based storage has been proven a better storage layout for OLAP. 
> Hive does a great job on raw row oriented storage. In this issue, we will enhance hive
to support column based storage. 
> Acctually we have done some work on column based storage on top of hdfs, i think it will
need some review and refactoring to port it to Hive.
> Any thoughts?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message