hive-dev mailing list archives

From "Krishna Kumar (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-2097) Explore mechanisms for better compression with RC Files
Date Mon, 14 May 2012 08:02:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-2097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Krishna Kumar updated HIVE-2097:
--------------------------------

    Attachment: datacomp.tar.gz

Unrefactored source for all of the implemented compression codecs.
                
> Explore mechanisms for better compression with RC Files
> -------------------------------------------------------
>
>                 Key: HIVE-2097
>                 URL: https://issues.apache.org/jira/browse/HIVE-2097
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor, Serializers/Deserializers
>            Reporter: Krishna Kumar
>            Assignee: Krishna Kumar
>            Priority: Minor
>         Attachments: datacomp.tar.gz
>
>
> Explore optimizations of the compression mechanisms used by RC File.
> Some initial ideas:
>
> 1. More efficient serialization/deserialization based on type-specific and storage-specific knowledge.
>    For instance, storing sorted numeric values efficiently using delta coding techniques.
> 2. More efficient compression based on type-specific and storage-specific knowledge.
>    Enable compression codecs to be specified based on types or individual columns.
> 3. Reordering the on-disk storage for better compression efficiency.
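
The delta coding idea in item 1 can be illustrated with a minimal sketch. This is not taken from the attached datacomp.tar.gz; the function names (delta_encode, delta_decode, compressed_size), the fixed 64-bit packing, and the use of zlib as the general-purpose codec are all assumptions made for illustration. It stores a sorted numeric column as a first value plus successive differences, then compares the compressed size against the unencoded column.

```python
import struct
import zlib


def delta_encode(values):
    """Return the first value followed by successive differences."""
    if not values:
        return []
    deltas = [values[0]]
    for prev, cur in zip(values, values[1:]):
        deltas.append(cur - prev)
    return deltas


def delta_decode(deltas):
    """Invert delta_encode by taking a running sum."""
    values = []
    total = 0
    for d in deltas:
        total += d
        values.append(total)
    return values


def compressed_size(values):
    """Pack as little-endian signed 64-bit integers, then zlib-compress.

    Hypothetical helper: a real column store would choose its own
    serialization and codec.
    """
    packed = struct.pack("<%dq" % len(values), *values)
    return len(zlib.compress(packed))


if __name__ == "__main__":
    # Illustrative sorted column: 10,000 values with a constant step of 3.
    column = list(range(1_000_000, 1_000_000 + 3 * 10_000, 3))
    deltas = delta_encode(column)

    # The transform must be lossless.
    assert delta_decode(deltas) == column

    print("compressed bytes, plain column:", compressed_size(column))
    print("compressed bytes, delta-coded: ", compressed_size(deltas))
```

On sorted data the differences are small and repetitive, so a general-purpose codec usually has much more redundancy to exploit after the transform. On unsorted data the differences can be as large as the values themselves, which is one reason the choice would depend on type-specific and storage-specific knowledge.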

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
