hadoop-common-issues mailing list archives

From "Joydeep Sen Sarma (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6837) Support for LZMA compression
Date Tue, 28 Sep 2010 21:59:36 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915938#action_12915938 ]

Joydeep Sen Sarma commented on HADOOP-6837:
-------------------------------------------

thanks to everyone for getting lzma into hadoop. it looks very promising.

i have tried applying the latest patch to both hadoop-0.20 (yahoo/facebook branch) and
common-trunk. in both cases, when i run TestCodec after compiling the native codec, i get
a sigsegv:

    [junit] Running org.apache.hadoop.io.compress.TestCodec
    [junit] #
    [junit] # An unexpected error has been detected by Java Runtime Environment:
    [junit] #
    [junit] #  SIGSEGV (0xb) at pc=0x00002aaad5215659, pid=16028, tid=1076017472
    [junit] #
    [junit] # Java VM: Java HotSpot(TM) 64-Bit Server VM (10.0-b23 mixed mode linux-amd64)
    [junit] # Problematic frame:
    [junit] # C  [libhadoop.so.1.0.0+0x5659]  thisRead+0x49
    [junit] #

separate from this, i had a question about tuning the compression level. in my testing on
internal data using the lzma utility built from the SDK, i found a set of options
(-a0 -mfhc4 -d24 -fbxxx) that gave a more suitable compromise between compression ratio and
cpu than the default. looking at the 'level'-based normalization, it seems i won't be able to
reach quite the settings i want by specifying a level alone. so being able to configure
these options separately would be very useful.
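
as a rough illustration of what "configuring these options separately" could look like, here is a minimal sketch in plain java. everything in it is an assumption made for illustration: the class name LzmaTuning, the property keys, and the defaults are hypothetical and are not taken from the HADOOP-6837 patch. the flag meanings (-a compression mode, -mf match finder, -d dictionary size as a power of two, -fb fast bytes) and the accepted ranges follow the LZMA SDK command-line utility as best understood. the fast-bytes value is left unset unless supplied, since the message above does not give one.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

/** Hypothetical holder for individually tunable LZMA encoder settings. */
final class LzmaTuning {
    // Hypothetical property keys; not defined by Hadoop or by the patch.
    static final String ALGORITHM_KEY = "lzma.algorithm";      // cf. -a
    static final String MATCH_FINDER_KEY = "lzma.matchfinder"; // cf. -mf
    static final String DICT_LOG2_KEY = "lzma.dictsize.log2";  // cf. -d
    static final String FAST_BYTES_KEY = "lzma.fastbytes";     // cf. -fb

    // Match finder ids as accepted by the LZMA SDK utility (assumed list).
    private static final List<String> MATCH_FINDERS =
        Arrays.asList("bt2", "bt3", "bt4", "hc4");

    final int algorithm;      // 0 = fast mode, 1 = normal mode
    final String matchFinder; // e.g. "hc4"
    final int dictLog2;       // dictionary size is 2^dictLog2 bytes
    final Integer fastBytes;  // null means: keep the encoder's default

    private LzmaTuning(int algorithm, String matchFinder,
                       int dictLog2, Integer fastBytes) {
        if (algorithm != 0 && algorithm != 1) {
            throw new IllegalArgumentException("algorithm must be 0 or 1");
        }
        if (!MATCH_FINDERS.contains(matchFinder)) {
            throw new IllegalArgumentException("unknown match finder: " + matchFinder);
        }
        if (dictLog2 < 0 || dictLog2 > 30) { // assumed SDK range
            throw new IllegalArgumentException("dictLog2 must be in [0, 30]");
        }
        this.algorithm = algorithm;
        this.matchFinder = matchFinder;
        this.dictLog2 = dictLog2;
        this.fastBytes = fastBytes;
    }

    /** Reads each option independently; defaults mirror the flags quoted above. */
    static LzmaTuning fromProperties(Properties p) {
        String fb = p.getProperty(FAST_BYTES_KEY);
        return new LzmaTuning(
            Integer.parseInt(p.getProperty(ALGORITHM_KEY, "0")),
            p.getProperty(MATCH_FINDER_KEY, "hc4"),
            Integer.parseInt(p.getProperty(DICT_LOG2_KEY, "24")),
            fb == null ? null : Integer.valueOf(fb));
    }

    long dictionarySizeBytes() {
        return 1L << dictLog2;
    }

    /** Renders the settings in the style of the SDK utility's flags. */
    String toFlags() {
        StringBuilder sb = new StringBuilder();
        sb.append("-a").append(algorithm)
          .append(" -mf").append(matchFinder)
          .append(" -d").append(dictLog2);
        if (fastBytes != null) {
            sb.append(" -fb").append(fastBytes);
        }
        return sb.toString();
    }
}
```

the point of the sketch is only that each knob is read and validated on its own, so a job could pick a fast mode with a large dictionary without being forced through a single 'level' number. how such values would reach the native encoder is outside what this sketch covers.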

> Support for LZMA compression
> ----------------------------
>
>                 Key: HADOOP-6837
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6837
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: io
>            Reporter: Nicholas Carlini
>            Assignee: Nicholas Carlini
>         Attachments: HADOOP-6837-lzma-1-20100722.non-trivial.pseudo-patch, HADOOP-6837-lzma-1-20100722.patch,
> HADOOP-6837-lzma-2-20100806.patch, HADOOP-6837-lzma-3-20100809.patch, HADOOP-6837-lzma-4-20100811.patch,
> HADOOP-6837-lzma-c-20100719.patch, HADOOP-6837-lzma-java-20100623.patch
>
>
> Add support for LZMA (http://www.7-zip.org/sdk.html) compression, which generally achieves
> higher compression ratios than both gzip and bzip2.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

