hadoop-common-issues mailing list archives

From "Tatu Saloranta (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-6389) Add support for LZF compression
Date Mon, 01 Aug 2011 03:15:37 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073428#comment-13073428 ]

Tatu Saloranta commented on HADOOP-6389:
----------------------------------------

The lzf4hadoop project on GitHub -- https://github.com/ning/lzf4hadoop -- now provides the necessary
wrappers.
I hope to do more testing to ensure that interaction with the Hadoop abstractions works as intended;
assuming things go well, this could serve as the implementation to use. Or, if a separate project
and Maven-accessible artifacts are enough, maybe just add a link from the documentation.

As to performance, see https://github.com/ning/jvm-compressor-benchmark .
LZF is the fastest pure Java compressor tested; of all the included codecs, only Snappy (which uses
JNI to call the C implementation of the Snappy codec) is faster for decompression, and it is about
as fast for compression.

Compression rates between the basic Lempel-Ziv implementations (QuickLZ, LZO, Snappy, LZF) are
comparable, and all are significantly faster than basic deflate (but with lower compression
rates).
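
For anyone who wants a local deflate baseline to compare against, here is a minimal sketch using only
java.util.zip.Deflater from the JDK. It is not part of lzf4hadoop or the benchmark project above; the
class name DeflateBaseline and the generated JSON-like sample data are assumptions made for
illustration, and the timings it prints are rough single-run numbers, not a proper benchmark.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Hypothetical helper (not from lzf4hadoop): measures deflate output size and rough time.
class DeflateBaseline {

    // Compresses the input at the given Deflater level and returns the compressed size in bytes.
    public static int compressedSize(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(input);
        deflater.finish();
        byte[] buf = new byte[8192];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buf);
        }
        deflater.end();
        return total;
    }

    // Builds repetitive JSON-like records; this sample data is an assumption, not real test input.
    public static byte[] sampleJson(int records) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < records; i++) {
            sb.append("{\"id\":").append(i)
              .append(",\"name\":\"user").append(i % 100)
              .append("\",\"active\":true}\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] data = sampleJson(200_000);
        int[] levels = {Deflater.BEST_SPEED, Deflater.DEFAULT_COMPRESSION, Deflater.BEST_COMPRESSION};
        for (int level : levels) {
            long start = System.nanoTime();
            int size = compressedSize(data, level);
            long ms = (System.nanoTime() - start) / 1_000_000;
            // ratio = compressed size / original size (lower means better compression)
            System.out.printf("level=%d ratio=%.3f time=%dms%n",
                    level, (double) size / data.length, ms);
        }
    }
}
```

Running main prints one line per deflate level; comparing those numbers with the LZF and Snappy
figures from the benchmark project on the same input gives a feel for the speed versus compression
rate trade-off described above.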


> Add support for LZF compression
> -------------------------------
>
>                 Key: HADOOP-6389
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6389
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: io
>            Reporter: Tatu Saloranta
>
> (note: related to [HADOOP-4874])
> As per Doug's earlier comments, LZF does indeed look like a good compressor candidate
> for fast compression/decompression with a good enough compression rate.
> From my testing it seems at least twice as fast at compression, and somewhat faster at
> decompressing, than gzip.
> Code from [http://h2database.googlecode.com/svn/trunk/h2/src/main/org/h2/compress/] is
> applicable, and I have tested it with JSON data.
> I hope to have more time to spend on this in the near future, but if someone else gets to this
> first that'd be good too.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
