hadoop-hive-dev mailing list archives

From "Ning Zhang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-947) Add run length encoding into RCFile's block header
Date Mon, 23 Nov 2009 22:25:39 GMT

    [ https://issues.apache.org/jira/browse/HIVE-947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781669#action_12781669 ]

Ning Zhang commented on HIVE-947:

Yongqiang, can you profile the simple query that we talked about the other day and see how much
CPU this can save? We should test on columns whose values are mostly the same length (e.g., type int)
and on columns with variable-length values (e.g., string).

> Add run length encoding into RCFile's block header 
> ---------------------------------------------------
>                 Key: HIVE-947
>                 URL: https://issues.apache.org/jira/browse/HIVE-947
>             Project: Hadoop Hive
>          Issue Type: Improvement
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>            Priority: Minor
>         Attachments: hive-947-2009-11-22.patch
> When RCFile constructs rows, it needs to get each column value's length by calling readVLong().
> This should be avoided for fixed-length or mostly fixed-length columns.
> This change also should not affect old RCFile files, which means it should still work correctly
> on files written in the previous RCFile format.
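
To make the idea in the description concrete, here is a minimal sketch of run-length encoding applied to a column's value lengths. It is not the code in the attached patch and does not reflect the actual RCFile header layout. The class name LengthRunLengthSketch, the (length, runCount) pair layout, and the sample lengths are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch: run-length encoding of per-value lengths for one column.
 * The pair layout [length0, count0, length1, count1, ...] is an assumption,
 * not the on-disk format used by RCFile or by the HIVE-947 patch.
 */
final class LengthRunLengthSketch {

    /** Collapses consecutive equal lengths into (length, runCount) pairs. */
    static List<Integer> encode(int[] lengths) {
        List<Integer> pairs = new ArrayList<>();
        int i = 0;
        while (i < lengths.length) {
            int value = lengths[i];
            int run = 1;
            // Extend the run while the next length matches the current one.
            while (i + run < lengths.length && lengths[i + run] == value) {
                run++;
            }
            pairs.add(value);
            pairs.add(run);
            i += run;
        }
        return pairs;
    }

    /** Expands (length, runCount) pairs back into one length per value. */
    static int[] decode(List<Integer> pairs) {
        int total = 0;
        for (int i = 1; i < pairs.size(); i += 2) {
            total += pairs.get(i);
        }
        int[] lengths = new int[total];
        int pos = 0;
        for (int i = 0; i < pairs.size(); i += 2) {
            int value = pairs.get(i);
            int run = pairs.get(i + 1);
            for (int k = 0; k < run; k++) {
                lengths[pos++] = value;
            }
        }
        return lengths;
    }

    public static void main(String[] args) {
        int[] intColumn = {4, 4, 4, 4, 4, 4};   // fixed-length values
        int[] stringColumn = {3, 7, 7, 2};      // variable-length values
        System.out.println(encode(intColumn));    // prints [4, 6]
        System.out.println(encode(stringColumn)); // prints [3, 1, 7, 2, 2, 1]
    }
}
```

In this sketch a fixed-length column collapses to a single pair, so a reader would decode one entry instead of one length per row. A column with highly variable lengths can come out larger than the plain list, which is one reason to measure both kinds of column.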

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
