cassandra-user mailing list archives

From daemeon reiydelle <>
Subject Re: High Bloom filter false ratio
Date Thu, 18 Feb 2016 19:38:09 GMT
The bloom filter buckets the values in a small number of buckets. I have
been surprised by how many cases I see with large cardinality where a few
values populate a given bloom leaf, resulting in high false positives, and
a surprising impact on latencies!
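For reference, here is a minimal sketch of how a Bloom filter's observed false positive rate can be compared against its configured target. It is a toy model, not Cassandra's implementation: the class `BloomFilter`, the helper `measure_fp_rate`, the SHA-256 based double hashing, and the key formats are all assumptions made for illustration. It uses the standard sizing formulas (bits = -n ln p / (ln 2)^2, hashes = (bits / n) ln 2), which may differ from what any particular Cassandra version does.

```python
import hashlib
import math


class BloomFilter:
    """Toy Bloom filter sized for a key count and a target false positive chance."""

    def __init__(self, expected_keys: int, fp_chance: float) -> None:
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        bits = -expected_keys * math.log(fp_chance) / math.log(2) ** 2
        self.num_bits = max(1, math.ceil(bits))
        self.num_hashes = max(1, round(self.num_bits / expected_keys * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key: bytes):
        # Double hashing: derive all bit positions from two halves of one digest.
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


def measure_fp_rate(n_keys: int = 20000, fp_chance: float = 0.01, probes: int = 20000) -> float:
    """Insert n_keys keys, then probe with keys that were never inserted."""
    bf = BloomFilter(n_keys, fp_chance)
    for i in range(n_keys):
        bf.add(f"present:{i}".encode())
    false_pos = sum(bf.might_contain(f"absent:{i}".encode()) for i in range(probes))
    return false_pos / probes


if __name__ == "__main__":
    print(f"observed false positive rate: {measure_fp_rate():.4f}")
```

With well-distributed hashes, the observed rate in this toy model should land near the configured `fp_chance`; a much higher figure on a real table suggests looking at how the reported ratio is computed or at which keys are being queried.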

Are you seeing 2:1 ratios between mean and worst-case latencies (allowing
for GC times)?

Daemeon Reiydelle
On Feb 18, 2016 8:57 AM, "Tyler Hobbs" <> wrote:

> You can try slightly lowering the bloom_filter_fp_chance on your table.
> Otherwise, it's possible that you're repeatedly querying one or two
> partitions that always trigger a bloom filter false positive.  You could
> try manually tracing a few queries on this table (for non-existent
> partitions) to see if the bloom filter rejects them.
> Depending on your Cassandra version, your false positive ratio could be
> inaccurate:
> There are also a couple of recent improvements to bloom filters:
> *
> *
> On Thu, Feb 18, 2016 at 1:35 AM, Anishek Agarwal <>
> wrote:
>> Hello,
>> We have a table with a composite partition key of humongous cardinality;
>> it's a combination of (long, long). On the table we have
>> bloom_filter_fp_chance=0.010000.
>> On running "nodetool cfstats" on the 5 nodes we have in the cluster, we are
>> seeing "Bloom filter false ratio:" in the range of 0.7-0.9.
>> I thought that over time the bloom filter would adjust to the key space
>> cardinality. We have been running the cluster for a long time now, but have
>> added significant traffic since January this year, which does not lead to
>> writes in the db but does lead to a high volume of reads to see if there
>> are any values.
>> Are there any settings that can be changed to allow a better ratio?
>> Thanks
>> Anishek
> --
> Tyler Hobbs
> DataStax <>
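
On the point quoted above that the reported false positive ratio can be inaccurate: the number depends heavily on which lookups go into the denominator. The sketch below is an illustration only. Both formulas, the function names, and the workload counts are hypothetical and are not taken from Cassandra's source or from the cluster in this thread. It shows how a read-heavy workload that mostly asks for non-existent partitions could report a ratio near 0.8 even while the filter rejects 99% of absent keys.

```python
def ratio_over_positives(false_pos: int, true_pos: int) -> float:
    """False positives as a share of all lookups the filter let through."""
    total = false_pos + true_pos
    return false_pos / total if total else 0.0


def ratio_over_absent(false_pos: int, true_neg: int) -> float:
    """False positives as a share of all lookups for keys that are absent."""
    total = false_pos + true_neg
    return false_pos / total if total else 0.0


# Hypothetical workload: 1,000,000 reads for absent partitions, of which the
# filter wrongly passes 1%, plus 2,500 reads for partitions that do exist.
false_pos = 10_000
true_neg = 990_000
true_pos = 2_500

print(ratio_over_positives(false_pos, true_pos))  # 0.8
print(ratio_over_absent(false_pos, true_neg))     # 0.01
```

Under these assumed numbers, the first definition looks alarming while the second matches the configured chance, so it is worth confirming which definition a given version reports before retuning the table.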
