hive-dev mailing list archives

From "Lefty Leverenz (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-8584) Setting hive.exec.orc.default.compress to ZLIB will lead to orc file size delta byte(s) shorter on Windows than Linux
Date Fri, 24 Oct 2014 02:26:34 GMT

    [ https://issues.apache.org/jira/browse/HIVE-8584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14182333#comment-14182333 ]

Lefty Leverenz commented on HIVE-8584:
--------------------------------------

The wiki has a few places this could be mentioned:

* [ORC -- Compression | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC#LanguageManualORC-Compression]
* orc.compress table property in [ORC -- HiveQLSyntax | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC#LanguageManualORC-HiveQLSyntax]
* [hive.exec.orc.default.compress in Configuration Properties | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.exec.orc.default.compress]
* no ORC discussion, just Gzip & Bzip2 for TextFile (doc needs updating): [Compressed Data Storage | https://cwiki.apache.org/confluence/display/Hive/CompressedStorage]

> Setting hive.exec.orc.default.compress to ZLIB will lead to orc file size delta byte(s) shorter on Windows than Linux
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-8584
>                 URL: https://issues.apache.org/jira/browse/HIVE-8584
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>         Environment: Windows
>            Reporter: Xiaobing Zhou
>            Assignee: Xiaobing Zhou
>            Priority: Critical
>         Attachments: HIVE-8584.1.patch, orc-win-none-1.dump, orc-win-none-2.dump, orc-win-snappy-1.dump, orc-win-snappy-2.dump, orc-win-zlib-1.dump, orc-win-zlib-2.dump, orc_analyze.q
>
>
> Repro steps:
> 1. Run the query orc_analyze.q.
> 2. Run hive --orcfiledump <target_orc_file_generated>.
> Run steps 1 and 2 in the PST timezone on Linux, then once more in another timezone (e.g. CST) on Windows.
> Compare the two ORC file dumps. The Windows ORC file is 1 byte shorter than the Linux one.
> The same difference appears between two Windows runs in different timezones, whereas Linux shows no such problem.
> The issue occurs only in ZLIB mode, where the OS native compression library is eventually used.
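
The comparison step in the description above could be sketched roughly as follows. This is only an illustration, not part of the attached patch: the helper names compare_bytes and compare_files are hypothetical, and the sketch assumes both generated ORC files have been copied to the local filesystem. It reports the size delta and the offset of the first differing byte, which may help locate where the one-byte difference originates.

```python
from pathlib import Path
from typing import Optional, Tuple


def compare_bytes(a: bytes, b: bytes) -> Tuple[int, Optional[int]]:
    """Return (len(a) - len(b), offset of the first differing byte).

    The offset is None when the two inputs are identical. If one input
    is a strict prefix of the other, the offset is the shorter length.
    """
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return len(a) - len(b), offset
    if len(a) != len(b):
        return len(a) - len(b), min(len(a), len(b))
    return 0, None


def compare_files(path_a: str, path_b: str) -> Tuple[int, Optional[int]]:
    """Hypothetical convenience wrapper: compare two local files byte by byte."""
    return compare_bytes(Path(path_a).read_bytes(), Path(path_b).read_bytes())
```

A negative first value from compare_files(windows_file, linux_file) would mean the Windows file is the shorter one, matching the report; both path arguments here are placeholders.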



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
