cassandra-commits mailing list archives

From "Jason Brown (JIRA)" <>
Subject [jira] [Comment Edited] (CASSANDRA-13291) Replace usages of MessageDigest with Guava's Hasher
Date Thu, 28 Sep 2017 12:22:00 GMT


Jason Brown edited comment on CASSANDRA-13291 at 9/28/17 12:21 PM:

I'm +1 on the code. [~mkjellman] wdyt about including the microbenchmark I put together?

UPDATE: Actually, I did have a few comments/questions:

- it would be nice to have a unit test class for {{HashingUtils}}, mostly to cover the
different cases in {{#updateBytes}}: empty buffer, on-heap, and off-heap (large and small).
- {{RandomPartitioner#hashToBigInteger}} - is this supposed to explicitly use {{MD5Digest#hash}}
still? If so, can we add a comment that it's intentionally doing so?
- for the {{Validator.CountingHasher#put*}} primitive functions, I think you need to increment
the {{count}} member field by the appropriate number of bytes. I *think* the way the existing
{{MessageDigest}} works is that {{#engineUpdate}} is invoked for each byte, so the original
implementation counted every byte.
- petty nits for the petty nit gods: clean up the added but unused imports in {{MessagingService}}
and {{SchemaConstants}}
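
To illustrate the {{count}} point above, here is a minimal sketch of the bookkeeping I have in mind. It is not the patch's {{Validator.CountingHasher}}: the class name {{CountingDigestSketch}} and its methods are hypothetical, it wraps a plain JDK {{MessageDigest}} rather than a Guava {{Hasher}}, and the big-endian encoding of primitives is an arbitrary choice made only for this example.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical stand-in for a counting hasher; NOT the Cassandra class.
final class CountingDigestSketch {
    private final MessageDigest digest;
    private long count; // total number of bytes fed to the digest so far

    CountingDigestSketch(String algorithm) {
        try {
            this.digest = MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException(e);
        }
    }

    CountingDigestSketch putByte(byte b) {
        digest.update(b);
        count += Byte.BYTES; // 1 byte
        return this;
    }

    CountingDigestSketch putBytes(byte[] bytes) {
        digest.update(bytes);
        count += bytes.length;
        return this;
    }

    CountingDigestSketch putInt(int value) {
        // Byte order is an assumption of this sketch (ByteBuffer defaults to big-endian).
        digest.update(ByteBuffer.allocate(Integer.BYTES).putInt(value).array());
        count += Integer.BYTES; // 4 bytes, not 1 per call
        return this;
    }

    CountingDigestSketch putLong(long value) {
        digest.update(ByteBuffer.allocate(Long.BYTES).putLong(value).array());
        count += Long.BYTES; // 8 bytes, not 1 per call
        return this;
    }

    long count() {
        return count;
    }

    byte[] hash() {
        return digest.digest();
    }
}
```

The idea is only that every {{put*}} overload adds the width of what it wrote, so the total matches what a byte-by-byte count would have produced.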

was (Author: jasobrown):
I'm +1 on the code. [~mkjellman] wdyt about including the microbenchmark I put together?

> Replace usages of MessageDigest with Guava's Hasher
> ---------------------------------------------------
>                 Key: CASSANDRA-13291
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Michael Kjellman
>            Assignee: Michael Kjellman
>         Attachments: CASSANDRA-13291-trunk.diff
> During my profiling of C* I frequently see lots of aggregate time across threads being
spent inside the MD5 MessageDigest implementation. Given that there are tons of modern alternative
hashing functions better than MD5 available -- both in terms of providing better collision
resistance and actual computational speed -- I wanted to switch out our usage of MD5 for alternatives
(like adler128 or murmur3_128) and test for performance improvements.
> Unfortunately, because we use MessageDigest everywhere, switching the hashing function
to something like adler128 or murmur3_128 (neither of which ships with the JDK) wasn't
straightforward.
> The goal of this ticket is to propose switching out usages of MessageDigest directly
in favor of Hasher from Guava. This means going forward we can change a single line of code
to switch the hashing algorithm being used (assuming there is an implementation in Guava).
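
A rough sketch of the "change a single line" idea described above, assuming Guava is on the classpath. The class {{HashingSketch}}, the constant {{HASH_FUNCTION}}, and the helper {{hash}} are hypothetical names for illustration and are not taken from the attached patch; only the Guava calls ({{Hashing.murmur3_128()}}, {{HashFunction#newHasher}}, {{Hasher#putBytes}}, {{Hasher#hash}}) are real API.

```java
import com.google.common.hash.HashCode;
import com.google.common.hash.HashFunction;
import com.google.common.hash.Hasher;
import com.google.common.hash.Hashing;

// Hypothetical helper; not part of Cassandra or the attached patch.
final class HashingSketch {
    // The one line to edit when switching algorithms,
    // e.g. to Hashing.md5() or Hashing.sha256().
    static final HashFunction HASH_FUNCTION = Hashing.murmur3_128();

    // Callers depend only on Hasher, never on a concrete algorithm.
    static HashCode hash(byte[] first, byte[] second) {
        Hasher hasher = HASH_FUNCTION.newHasher();
        hasher.putBytes(first);
        hasher.putBytes(second);
        return hasher.hash();
    }
}
```

Because call sites only see {{Hasher}}, swapping the function assigned to {{HASH_FUNCTION}} would change the algorithm everywhere without touching them.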

This message was sent by Atlassian JIRA
