cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-8413) Bloom filter false positive ratio is not honoured
Date Wed, 03 Dec 2014 11:45:12 GMT
Benedict created CASSANDRA-8413:
-----------------------------------

             Summary: Bloom filter false positive ratio is not honoured
                 Key: CASSANDRA-8413
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8413
             Project: Cassandra
          Issue Type: Bug
          Components: Core
            Reporter: Benedict
             Fix For: 2.0.12, 2.1.3


Whilst thinking about CASSANDRA-7438 and hash bits, I realised we are sabotaging our bloom
filters when using the murmur3 partitioner. I have performed a very quick test to confirm
this risk is real.

Since a typical cluster uses the same murmur3 hash for partitioning as we do for bloom filter
lookups, and each node owns a contiguous token range, we can guarantee that the top X bits
collide for all keys on the node. This translates into poor bloom filter distribution. I quickly
hacked LongBloomFilterTest to simulate the problem, and the result in these tests is _up to_ a
doubling of the actual false positive ratio. The actual change will depend on the key
distribution, the number of keys, the false positive ratio, the number of nodes, the token
distribution, etc., but it seems to be a real problem for non-vnode clusters of at least ~128
nodes in size.
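
To make "the top X bits collide" concrete, here is a rough sketch (not the hacked
LongBloomFilterTest) that counts how many leading bits every token in one node's contiguous
range must share. It assumes evenly spaced single tokens (no vnodes) and treats the token space
as unsigned 64-bit for simplicity, whereas the murmur3 partitioner uses signed tokens. The
names node_range and shared_top_bits are hypothetical, and the sketch does not model how the
bloom filter turns the hash into bucket indexes.

```python
def node_range(index, nodes, width=64):
    """Inclusive (lo, hi) token range owned by node `index`, assuming
    `nodes` evenly spaced tokens over an unsigned `width`-bit space."""
    size = 1 << width
    lo = index * size // nodes
    hi = (index + 1) * size // nodes - 1
    return lo, hi


def shared_top_bits(lo, hi, width=64):
    """Number of leading bits common to every value in [lo, hi].

    Any value between lo and hi keeps the prefix that lo and hi share,
    so it is enough to find the highest bit where the two ends differ."""
    diff = lo ^ hi
    return width - diff.bit_length()


if __name__ == "__main__":
    for nodes in (1, 16, 128, 256):
        lo, hi = node_range(0, nodes)
        print(nodes, "nodes:", shared_top_bits(lo, hi), "shared top bits")
```

Under these assumptions a 128-node cluster fixes the top 7 bits of every key's token on a given
node, so any bloom filter hashing that reuses those bits sees less entropy than it expects.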



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
