hbase-dev mailing list archives

From "Andrzej Bialecki (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-875) Use MurmurHash instead of JenkinsHash
Date Tue, 23 Sep 2008 08:54:47 GMT

    [ https://issues.apache.org/jira/browse/HBASE-875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633657#action_12633657 ]

Andrzej Bialecki  commented on HBASE-875:
-----------------------------------------

Re: deserialization. Sure, hash values can be anything. But the first parameter in the old
format is the number of hash functions to use, not the hash value, so it can't be negative.
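
To illustrate the point about the first field, here is a minimal sketch of how a reader could tell the two formats apart. It assumes the new format starts with a negative version marker followed by the hash count and a hash type code. The class name, constants, and field layout are hypothetical and are not taken from the attached patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical filter header; layout and constants are assumptions for illustration. */
final class FilterHeader {
    static final int VERSION = -1;      // assumed marker for the new format
    static final int JENKINS_HASH = 0;  // assumed type codes
    static final int MURMUR_HASH = 1;

    final int nbHash;
    final int hashType;

    FilterHeader(int nbHash, int hashType) {
        this.nbHash = nbHash;
        this.hashType = hashType;
    }

    /** Old layout: the first int is the number of hash functions. */
    static byte[] encodeOld(int nbHash) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(nbHash);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Assumed new layout: version marker, hash count, hash type. */
    static byte[] encodeNew(int nbHash, int hashType) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(VERSION);
            out.writeInt(nbHash);
            out.writeByte(hashType);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static FilterHeader decode(byte[] bytes) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
            int first = in.readInt();
            if (first > 0) {
                // A positive value can only be a hash count, so this is the
                // old format, which always used JenkinsHash.
                return new FilterHeader(first, JENKINS_HASH);
            }
            if (first == VERSION) {
                int nbHash = in.readInt();
                int hashType = in.readByte();
                return new FilterHeader(nbHash, hashType);
            }
            throw new IOException("Unsupported filter header: " + first);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because a hash count is never negative, old data cannot be mistaken for the versioned layout, and existing filters keep deserializing without a migration step.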

Re: configuration. I was of two minds on this, but if we allowed configuring the hash function
in these cases, then we would have to persist this information somewhere in the data, which
sounds kind of messy, so I decided against it. Perhaps the name of the property should indicate
that it affects only BloomFilters... OTOH, some day we may want to use this configuration knob in
other places too.
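
As a rough sketch of the configuration knob discussed above, the lookup could resemble the following. The property name, the default value, and the enum are assumptions made for illustration, and plain java.util.Properties stands in for the real configuration class.

```java
import java.util.Locale;
import java.util.Properties;

/** Hypothetical lookup of the hash function from configuration. */
final class HashTypeConfig {
    // Assumed property name; the actual name was still under discussion.
    static final String HASH_TYPE_KEY = "bloomfilter.hash.type";

    enum HashType { JENKINS, MURMUR }

    static HashType fromConfig(Properties conf) {
        // The default of "murmur" is an assumption, not a decision from the issue.
        String name = conf.getProperty(HASH_TYPE_KEY, "murmur")
                .trim()
                .toLowerCase(Locale.ROOT);
        switch (name) {
            case "jenkins":
                return HashType.JENKINS;
            case "murmur":
                return HashType.MURMUR;
            default:
                throw new IllegalArgumentException("Unknown hash type: " + name);
        }
    }
}
```

Any component that reads this setting while writing data also has to record which hash it used, otherwise a later change to the property would make the stored data unreadable. That is the persistence cost mentioned above.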

> Use MurmurHash instead of JenkinsHash
> -------------------------------------
>
>                 Key: HBASE-875
>                 URL: https://issues.apache.org/jira/browse/HBASE-875
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: util
>    Affects Versions: 0.19.0
>            Reporter: Andrzej Bialecki 
>         Attachments: murmur.patch
>
>
> I recently ported the MurmurHash (http://murmurhash.googlepages.com/) to Java, and according
> to my tests it's roughly 5 times faster than the current version of JenkinsHash in the trunk/.
> According to the author (and other analysts at comp.sci.crypt) this hash has excellent
> avalanche behavior and a low collision rate. I propose to either replace the JenkinsHash or
> add this hash as an option to be used in BloomFilter-s and related classes.
> If your opinion is positive, I'll prepare a patch. The Java implementation of the hash
> can be found here: http://www.getopt.org/murmur/MurmurHash.java
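
For readers unfamiliar with the hash named in the issue description, the following is a minimal sketch of the 32-bit MurmurHash2 algorithm as commonly published. It is not the attached patch or the linked MurmurHash.java. The class name is hypothetical, and bytes are treated as unsigned, which is an assumption that may differ from the ported version.

```java
/** Illustrative 32-bit MurmurHash2; not the implementation from the patch. */
final class Murmur2Sketch {
    static int hash(byte[] data, int seed) {
        final int m = 0x5bd1e995; // mixing constant of MurmurHash2
        final int r = 24;
        int length = data.length;
        int h = seed ^ length;

        // Mix the body four bytes at a time, read as a little-endian int.
        int i = 0;
        while (length - i >= 4) {
            int k = (data[i] & 0xff)
                    | ((data[i + 1] & 0xff) << 8)
                    | ((data[i + 2] & 0xff) << 16)
                    | ((data[i + 3] & 0xff) << 24);
            k *= m;
            k ^= k >>> r;
            k *= m;
            h *= m;
            h ^= k;
            i += 4;
        }

        // Fold in the remaining one to three bytes, if any.
        int left = length - i;
        if (left == 3) {
            h ^= (data[i + 2] & 0xff) << 16;
        }
        if (left >= 2) {
            h ^= (data[i + 1] & 0xff) << 8;
        }
        if (left >= 1) {
            h ^= (data[i] & 0xff);
            h *= m;
        }

        // Final avalanche so the last bytes affect all output bits.
        h ^= h >>> 13;
        h *= m;
        h ^= h >>> 15;
        return h;
    }
}
```

A Bloom filter typically needs several hash values per key. One common approach is to call a function like this repeatedly with different seeds, though whether the patch does so is not stated here.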

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

