cassandra-user mailing list archives

From Thomas GERBET <tho...@gerbet.me>
Subject Impact of Bloom filter false positive rate
Date Fri, 30 May 2014 14:02:36 GMT
Hi,

I'm currently working on some properties of Bloom filters, and this is the
first time I have used Cassandra, so I'm sorry if my question seems dumb.
Basically, I am trying to see the impact of the Bloom filter false positive
rate on read performance.

My test case is:
1. I create a table with:
   create table bloom.test_fp (t text primary key, d text) with
   bloom_filter_fp_chance = <fp_rate>
2. I fill this table with 100000 rows of random data
3. I force the creation of an SSTable by flushing the memtable with nodetool flush
4. I measure the time required to perform 1000000 basic queries like
   select * from bloom.test_fp where t = <random_data>
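
For reference, here is a rough sketch of how large a filter each <fp_rate> would imply for my 100000 rows. It uses the textbook sizing formulas for an optimal Bloom filter (m = -n ln p / (ln 2)^2 bits, k = (m/n) ln 2 hash functions). The function name bloom_parameters is my own, and I am assuming, without having checked, that Cassandra's real sizing is close to these numbers.

```python
import math


def bloom_parameters(n_keys, fp_rate):
    """Textbook optimal Bloom filter sizing (not Cassandra's actual code).

    Returns (bits, hashes) for n_keys entries and a target false
    positive probability fp_rate, with 0 < fp_rate < 1.
    """
    if n_keys <= 0:
        raise ValueError("n_keys must be positive")
    if not 0.0 < fp_rate < 1.0:
        raise ValueError("fp_rate must be strictly between 0 and 1")
    # m = -n * ln(p) / (ln 2)^2, rounded up to a whole number of bits
    bits = math.ceil(-n_keys * math.log(fp_rate) / (math.log(2) ** 2))
    # k = (m / n) * ln 2, rounded to the nearest integer, at least 1
    hashes = max(1, round(bits / n_keys * math.log(2)))
    return bits, hashes


if __name__ == "__main__":
    for p in (0.1, 0.01, 0.001):
        bits, hashes = bloom_parameters(100_000, p)
        print(f"fp_rate={p}: {bits} bits ({bits / 8 / 1024:.0f} KiB), {hashes} hashes")
```

Under these formulas the filter for 100000 keys stays in the range of a few hundred KiB at most for the rates I tried, so it should fit in memory in every case.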

Surprisingly, there is not much difference in query time between the false
positive rates I tried. I suspect some caches are interfering.
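
As a back-of-the-envelope check of what I expected to see, the sketch below counts how many of the queries should wrongly pass the filter and go on to read the SSTable. It assumes that every queried key is absent from the table (my random keys almost never match a stored row), that there is a single SSTable, and that caches are ignored. The function name expected_false_positive_reads is hypothetical.

```python
def expected_false_positive_reads(n_queries, fp_rate, n_sstables=1):
    """Expected number of wasted SSTable lookups for absent keys.

    Assumption: each absent-key query independently passes each
    SSTable's Bloom filter with probability fp_rate.
    """
    if not 0.0 <= fp_rate <= 1.0:
        raise ValueError("fp_rate must be between 0 and 1")
    return n_queries * fp_rate * n_sstables


if __name__ == "__main__":
    for p in (0.1, 0.01, 0.001):
        wasted = expected_false_positive_reads(1_000_000, p)
        print(f"fp_rate={p}: about {wasted:.0f} wasted SSTable lookups")
```

If those extra lookups are served from memory instead of disk, each one is cheap, which might be why the total time barely changes.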

Is there a way for me to see the impact on performance without using a large
dataset?

Thanks.
