cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Robert Stupp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-13291) Replace usages of MessageDigest with Guava's Hasher
Date Sat, 11 Mar 2017 10:29:04 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906159#comment-15906159
] 

Robert Stupp commented on CASSANDRA-13291:
------------------------------------------

[~mkjellman], the patch doesn't cleanly apply to trunk (conflicts with your CLIbrary patch
;) ). Can you provide a rebased branch for this on GitHub?

Some comments & nits:
* {{FBUtilities.threadLocalMD5Digest}} should this be refactored to Guava, too. Since it's
always MD5, it requires a separate {{HashFunction}} constant. That way we could get rid of
the thread local, which seems no longer needed, since Guava prefers the less expensive "clone"
instead of {{MessageDigest.getInstance}}.
* {{MessagingService.CURRENT_HASH_FUNCTION}} is probably a bad place as it introduces a new
class dependency to MS in some places, which is unfortunate for offline tools. Would not mind
to place it in {{FBUtilities}}.
* {{ClusteringPrefix.digest}} - the javadoc parameter needs to be renamed, too
* {{FBUtilities}} - any chance to avoid the use of {{JCAUtil}}?
* {{MessagingService}} - unused import {{Hasher}}
* {{PartitionTest}} - looks like {{assertNotSame}} should be {{assertNotEquals}}
* {{CountingDigest}} I really like the removal of {{extends MessageDigest}}.
* Would you agree that both MD5 and SHA256 use the "clone" approach to create new hasher instances?


> Replace usages of MessageDigest with Guava's Hasher
> ---------------------------------------------------
>
>                 Key: CASSANDRA-13291
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13291
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Michael Kjellman
>            Assignee: Michael Kjellman
>         Attachments: CASSANDRA-13291-trunk.diff
>
>
> During my profiling of C* I frequently see lots of aggregate time across threads being
spent inside the MD5 MessageDigest implementation. Given that there are tons of modern alternative
hashing functions better than MD5 available -- both in terms of providing better collision
resistance and actual computational speed -- I wanted to switch out our usage of MD5 for alternatives
(like adler128 or murmur3_128) and test for performance improvements.
> Unfortunately, I found given the fact we use MessageDigest everywhere --  switching out
the hashing function to something like adler128 or murmur3_128 (for example) -- which don't
ship with the JDK --  wasn't straight forward.
> The goal of this ticket is to propose switching out usages of MessageDigest directly
in favor of Hasher from Guava. This means going forward we can change a single line of code
to switch the hashing algorithm being used (assuming there is an implementation in Guava).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message