hive-dev mailing list archives

From "Lefty Leverenz (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-4123) The RLE encoding for ORC can be improved
Date Fri, 08 Aug 2014 02:12:12 GMT

    [ https://issues.apache.org/jira/browse/HIVE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090193#comment-14090193 ]

Lefty Leverenz commented on HIVE-4123:
--------------------------------------

Doc questions:  Would it be okay to restore part of the original description for *hive.exec.orc.write.format*
in the wiki (and later in HiveConf.java)?

* The current description is just "Define the version of the file to write". That gives no
hint of the possible values (the default is null), and it does not make clear that "version
of the file" means the Hive version.
* The original description was "use 0.11 version of RLE encoding. if this conf is not defined
or any other value specified, ORC will use the new RLE encoding".

So I'd like to add "Possible values are 0.11, 0.12, etc.  If this parameter is not defined,
ORC will use the RLE encoding introduced in Hive 0.12.  Any value other than 0.11 results
in the 0.12 encoding."

Is that accurate?  Can releases be specified as "0.12.0" or "0.13.1"?
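
To make the proposed wording concrete, here is a minimal sketch of the rule it describes. It
is not Hive's actual code: the helper name resolve_rle_encoding is hypothetical, and the plain
string comparison is an assumption. Whether release strings such as "0.12.0" are accepted is
the open question above, so the sketch only shows what "any value other than 0.11" would mean
if taken literally.

```python
# Illustrative only: mirrors the proposed doc wording, not HiveConf.java.
LEGACY_VERSION = "0.11"  # the one value the wording treats specially


def resolve_rle_encoding(write_format=None):
    """Hypothetical helper: map a hive.exec.orc.write.format value to the
    Hive release whose RLE encoding would be used."""
    # Assumption: exact string match. An unset parameter (None) or any other
    # value falls through to the encoding introduced in Hive 0.12.
    if write_format == LEGACY_VERSION:
        return "0.11"
    return "0.12"
```

Under that literal reading, leaving the parameter unset and setting it to "0.12" or "0.13.1"
all select the 0.12 encoding, and only "0.11" selects the older one.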



> The RLE encoding for ORC can be improved
> ----------------------------------------
>
>                 Key: HIVE-4123
>                 URL: https://issues.apache.org/jira/browse/HIVE-4123
>             Project: Hive
>          Issue Type: New Feature
>          Components: File Formats
>    Affects Versions: 0.12.0
>            Reporter: Owen O'Malley
>            Assignee: Prasanth J
>              Labels: TODOC12, orcfile
>             Fix For: 0.12.0
>
>         Attachments: HIVE-4123-8.patch, HIVE-4123.1.git.patch.txt, HIVE-4123.2.git.patch.txt,
> HIVE-4123.3.patch.txt, HIVE-4123.4.patch.txt, HIVE-4123.5.txt, HIVE-4123.6.txt, HIVE-4123.7.txt,
> HIVE-4123.8.txt, HIVE-4123.8.txt, HIVE-4123.patch.txt, ORC-Compression-Ratio-Comparison.xlsx
>
>
> The run length encoding of integers can be improved:
> * tighter bit packing
> * allow delta encoding
> * allow longer runs
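
As a rough illustration of why delta encoding and tighter bit packing help, the sketch below
delta-encodes a sequence of integers and compares the fixed bit width needed before and after.
It is a toy example under stated assumptions, not the ORC on-disk format: the function names
are hypothetical, bit_width handles non-negative values only, and no sign handling or run
headers are modeled.

```python
def delta_encode(values):
    """Split a list of integers into (first value, successive differences)."""
    if not values:
        return None, []
    deltas = [b - a for a, b in zip(values, values[1:])]
    return values[0], deltas


def delta_decode(base, deltas):
    """Rebuild the original list from a base value and its deltas."""
    if base is None:
        return []
    out = [base]
    for d in deltas:
        out.append(out[-1] + d)
    return out


def bit_width(values):
    """Smallest fixed width, in bits, that holds every value.

    Assumption: values are non-negative; at least one bit is always used.
    """
    return max((v.bit_length() for v in values), default=0) or 1


values = [1000, 1003, 1006, 1009, 1012]
base, deltas = delta_encode(values)
raw_bits = bit_width(values)    # width needed to pack the values directly
delta_bits = bit_width(deltas)  # width needed to pack only the differences
```

For this input the raw values need 10 bits each, while the deltas (all equal to 3) need only
2 bits each, and delta_decode(base, deltas) returns the original list.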



--
This message was sent by Atlassian JIRA
(v6.2#6252)
