cassandra-commits mailing list archives

From "Adrien Grand (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-5038) LZ4Compressor
Date Thu, 06 Dec 2012 23:23:20 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13525993#comment-13525993 ]

Adrien Grand commented on CASSANDRA-5038:
-----------------------------------------

bq. Cool, yeah I'm not sure if we can use the "known size" decompressor, does it have to be
exact or can it be upper bounded? We know from the block size the max compressed length.

It needs to be exact, or decompression will fail. One way to make it usable is to write the
original length as an int (or, better, as a variable-length int) before the compressed bytes.
Upon decompression, first read the original length, then pass it to the "known size"
decompressor.
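
A minimal sketch of that length-prefix framing, using only the JDK. The class and method names
(LengthPrefixFraming, writeVInt, readVInt, frame) are made up for illustration and are not part
of lz4-java; the compressed bytes are treated as an opaque payload, and the call to the
"known size" decompressor is left as a comment.

```java
import java.util.Arrays;

// Hypothetical helper, not part of lz4-java: prefixes a compressed block with
// its original (uncompressed) length, encoded as a variable-length int.
final class LengthPrefixFraming {

    // Writes value using 7 bits per byte; the high bit means "more bytes follow".
    // Returns the offset just past the last byte written.
    static int writeVInt(int value, byte[] dest, int off) {
        while ((value & ~0x7F) != 0) {
            dest[off++] = (byte) ((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        dest[off++] = (byte) value;
        return off;
    }

    // Reads a variable-length int; returns { value, offset after the vint }.
    static int[] readVInt(byte[] src, int off) {
        int value = 0;
        int shift = 0;
        byte b;
        do {
            b = src[off++];
            value |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return new int[] { value, off };
    }

    // Builds [vint originalLength][compressed bytes].
    static byte[] frame(int originalLength, byte[] compressed, int compressedLength) {
        byte[] out = new byte[5 + compressedLength]; // a vint takes at most 5 bytes
        int off = writeVInt(originalLength, out, 0);
        System.arraycopy(compressed, 0, out, off, compressedLength);
        return Arrays.copyOf(out, off + compressedLength);
    }

    // On the read side:
    //   int[] header = readVInt(framed, 0);
    //   byte[] restored = new byte[header[0]];
    //   ...then hand 'framed' starting at offset header[1], plus 'restored',
    //   to the "known size" decompressor.
}
```

For typical block sizes the variable-length header costs 2 or 3 bytes instead of the 4 bytes of
a fixed int.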

bq.  I'd suggest you add a simple way for us to pick the best compressor for our node.

This is what LZ4Factory#defaultInstance (I should probably rename it to fastestInstance) aims
to do, but right now it only tries the unsafe implementation and then the safe one. I'll try
to add support for the native impl soon.
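
To illustrate the "try the fastest, then fall back" idea in general terms, here is a sketch that
is not taken from lz4-java: FirstAvailable and firstAvailable are hypothetical names, and the
candidates are plain suppliers standing in for the unsafe, safe, and (eventually) native
implementations.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper, not the LZ4Factory implementation: returns the first
// candidate that can be instantiated, trying them in preference order.
final class FirstAvailable {

    static <T> T firstAvailable(List<Supplier<T>> candidatesFastestFirst) {
        Throwable lastFailure = null;
        for (Supplier<T> candidate : candidatesFastestFirst) {
            try {
                return candidate.get();
            } catch (RuntimeException | LinkageError e) {
                // For example, sun.misc.Unsafe is not accessible or a native
                // library failed to load: remember why and try the next one.
                lastFailure = e;
            }
        }
        throw new IllegalStateException("no implementation available", lastFailure);
    }
}
```

A caller would pass the candidates ordered from fastest to most portable, so the pure Java
implementation ends up as the last resort.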

Another feature of these compressors you might be interested in is that you can provide them
with an output buffer of any length and they will succeed only if they managed to generate
an output which is small enough (and they will fail as soon as they know they won't make it).
So for example, you could decide to write the raw bytes instead of the compressed bytes if
LZ4 didn't manage to compress your data by more than 10%:

{code}
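  // 90% of the original length; the long arithmetic avoids int overflow on very large inputs
  final int maxAcceptableCompressedLength = (int) (originalLength * 90L / 100);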
  try {
    dest[0] = 0; // means compressed
    final int compressedLength = compressor.compress(src, 0, originalLength, dest, 1, maxAcceptableCompressedLength);
    return 1 + compressedLength;
  } catch (LZ4Exception e) {
    dest[0] = 1; // means not compressed
    System.arraycopy(src, 0, dest, 1, originalLength);
    return 1 + originalLength;
  }
{code}
(Only the native LZ4 HC impl doesn't support this feature.)

                
> LZ4Compressor
> -------------
>
>                 Key: CASSANDRA-5038
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5038
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: T Jake Luciani
>            Priority: Minor
>             Fix For: 1.2.1
>
>         Attachments: LZ4Compressor.java, lz4-java.jar
>
>
> LZ4 is a new compression algo that's ~2x faster than Snappy.
> [~jpountz] has written a nice java port which includes a misc.Unsafe version that performs
> at least as well as our java snappy version.
> Details at http://blog.jpountz.net/post/28092106032/wow-lz4-is-fast
> The nice thing is this should work with java7 and be more portable.
> We can also fall back to the pure java impl.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
