cassandra-commits mailing list archives

From "David Allsopp (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-2975) Upgrade MurmurHash to version 3
Date Sun, 20 Nov 2011 19:47:54 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153858#comment-13153858 ]

David Allsopp commented on CASSANDRA-2975:
------------------------------------------

I have just repeated the benchmark with a HeapByteBuffer as input rather than a DirectByteBuffer
(using ByteBuffer.allocate() rather than ByteBuffer.allocateDirect()), and the performance
improvement seems to almost vanish.

The input to MurmurHash within Cassandra seems to be a HeapByteBuffer (based on adding a println
to the existing MurmurHash2 hash64() method), so the inlining is probably of no benefit in
practice.
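
For anyone wanting to repeat the check, here is a minimal sketch of how the two buffer kinds
can be told apart at runtime. The class BufferKind and its describe() method are hypothetical
names used only for illustration; they are not part of Cassandra or of the attached benchmark.
Only standard java.nio.ByteBuffer calls are used.

```java
import java.nio.ByteBuffer;

public class BufferKind {
    // Hypothetical helper (not Cassandra code): reports which kind of
    // ByteBuffer was passed in, similar in spirit to a diagnostic println
    // placed inside a hash method.
    public static String describe(ByteBuffer buf) {
        if (buf.isDirect()) {
            // Created by allocateDirect(); memory lives outside the Java heap.
            return "direct";
        }
        // Writable heap buffers created by allocate() expose their backing array.
        return buf.hasArray() ? "heap (array-backed)" : "non-direct (no accessible array)";
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(16);
        ByteBuffer direct = ByteBuffer.allocateDirect(16);
        System.out.println(describe(heap));
        System.out.println(describe(direct));
    }
}
```

A benchmark that wants to cover both cases could generate its input once with allocate() and
once with allocateDirect() and report the two timings separately.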
                
> Upgrade MurmurHash to version 3
> -------------------------------
>
>                 Key: CASSANDRA-2975
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2975
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Brian Lindauer
>            Assignee: Brian Lindauer
>            Priority: Trivial
>              Labels: lhf
>             Fix For: 1.1
>
>         Attachments: 0001-Convert-BloomFilter-to-use-MurmurHash-v3-instead-of-.patch,
>                      0002-Backwards-compatibility-with-files-using-Murmur2-blo.patch,
>                      Murmur3Benchmark.java
>
>
> MurmurHash version 3 was finalized on June 3. It provides an enormous speedup and increased
> robustness over version 2, which is implemented in Cassandra. Information here:
> http://code.google.com/p/smhasher/
> The reference implementation is here:
> http://code.google.com/p/smhasher/source/browse/trunk/MurmurHash3.cpp?spec=svn136&r=136
> I have already done the work to port the (public domain) reference implementation to
> Java in the MurmurHash class and updated the BloomFilter class to use the new implementation:
> https://github.com/lindauer/cassandra/commit/cea6068a4a3e5d7d9509335394f9ef3350d37e93
> Apart from the faster hash time, the new version only requires one call to hash() rather
> than 2, since it returns 128 bits of hash instead of 64.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
