cassandra-commits mailing list archives

From "Marcus Eriksson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-3127) Message (inter-node) compression
Date Wed, 30 May 2012 15:05:24 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13285721#comment-13285721 ]

Marcus Eriksson commented on CASSANDRA-3127:
--------------------------------------------

Built the version which always sends the smallest message and saw great results in compression
ratios: a standard stress test gave a 20% compression ratio, and basically all messages were
compressed. The gain from checking which message was smallest was minimal.

A drawback was that memory usage increased quite a lot, since we need to serialize the message,
compress it and compare sizes instead of just serializing the message to the DataOutputStream.

So instead I just compressed all messages, with good results.
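
For readers without the patches at hand, the two strategies compared above can be sketched
roughly as follows. This is only an illustration, not the attached code: the class and method
names are hypothetical, and java.util.zip.Deflater is an assumed stand-in for whatever codec
the patches actually use.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

// Hypothetical sketch; names are illustrative and do not come from the patches.
public class MessageCompressionSketch {

    // "Compress everything": compress an already serialized message.
    // Deflater is an assumed stand-in codec.
    static byte[] compress(byte[] serialized) {
        Deflater deflater = new Deflater();
        deflater.setInput(serialized);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // "Send the smallest": both the serialized and the compressed form must be
    // held in memory at once to compare sizes, which is the extra memory cost
    // described above. A real protocol would also need a flag telling the
    // receiver which form was sent; that is omitted here.
    static byte[] smallest(byte[] serialized) {
        byte[] compressed = compress(serialized);
        return compressed.length < serialized.length ? compressed : serialized;
    }
}
```

With this sketch, a large repetitive payload comes back compressed from smallest(), while a
one-byte payload comes back unchanged because the codec overhead exceeds the payload.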

I attach both patches; they add a configuration option like:
+# internode_compression controls whether traffic between nodes is
+# compressed.
+# can be:  all  - all traffic is compressed
+#          dc   - traffic between different datacenters is compressed
+#          none - nothing is compressed.
+internode_compression: all
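
As a rough illustration of how the three values could drive the per-connection decision, here
is a minimal sketch. The enum, the method names and the datacenter string arguments are all
assumptions made for the example; the attached patches may structure this differently.

```java
import java.util.Locale;

// Hypothetical sketch; none of these names are taken from the patches.
public class InternodeCompressionSketch {

    // Mirrors the three documented values of internode_compression.
    enum Mode { ALL, DC, NONE }

    // Parse the configured value, e.g. "all", "dc" or "none".
    static Mode parse(String value) {
        return Mode.valueOf(value.trim().toUpperCase(Locale.ROOT));
    }

    // Decide whether traffic to a peer should be compressed.
    // localDc and remoteDc are assumed datacenter names for the two nodes.
    static boolean shouldCompress(Mode mode, String localDc, String remoteDc) {
        switch (mode) {
            case ALL:
                return true;                       // all traffic is compressed
            case DC:
                return !localDc.equals(remoteDc);  // only cross-datacenter traffic
            default:
                return false;                      // none: nothing is compressed
        }
    }
}
```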

                
> Message (inter-node) compression
> --------------------------------
>
>                 Key: CASSANDRA-3127
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3127
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Sylvain Lebresne
>            Assignee: Marcus Eriksson
>            Priority: Minor
>         Attachments: CASSANDRA-3127.patch, CHECK_SIZES-CASSANDRA-3127.patch
>
>
> CASSANDRA-3015 adds compression of streams. But it could be useful to also compress some
> messages.
> Compressing messages is easy, but what may be a little bit trickier is when and what messages
> to compress to get the best performance.
> The simple solution would be to just have it either always on or always off. But for
> very small messages (gossip?) that may be counter-productive. On the other side of the spectrum,
> compressing is likely always a good choice for, say, the exchange of merkle trees across
> data-centers. We could maybe define a message size above which we start to compress. Maybe
> the option to only compress cross data-center messages would be useful too (but I may
> also just be getting carried away).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
