cassandra-user mailing list archives

From Adarsh Kumar <adarsh0...@gmail.com>
Subject Re: Setting bloom_filter_fp_chance < 0.01
Date Thu, 19 May 2016 03:53:59 GMT
Hi Sai,

We have a use case where we are designing a table that will hold
around 50 billion rows, and we require very fast reads. Partitions are not
complex or large; each holds
some validation data for duplicate checks (4-5 int and varchar values).
So we are trying various options to optimize read performance. Apart from
tuning the Bloom filter, we are trying the following:

1). Better data modelling (making appropriate partition and clustering keys)
2). Trying Leveled compaction (changing data model for this one)

Jonathan,

I understand that tuning bloom_filter_fp_chance will not give a drastic
performance gain,
but it is one of the many things we are trying.
Please let me know if you have any other suggestions to improve read
performance for this volume of data.
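
To put rough numbers on the size cost, here is a minimal sketch. It assumes
the textbook formula for an optimally configured Bloom filter
(bits per key = -ln(p) / (ln 2)^2); the function names are only illustrative,
and Cassandra's actual per-SSTable sizing may differ from this estimate.

```python
import math


def bloom_bits_per_key(fp_chance: float) -> float:
    """Bits per key for an ideal Bloom filter with the given false-positive chance.

    Textbook formula, used here as an assumption; not Cassandra's exact sizing.
    """
    if not 0.0 < fp_chance < 1.0:
        # fp_chance = 0 would need an infinite number of bits (log(0) is undefined).
        raise ValueError("fp_chance must be strictly between 0 and 1")
    return -math.log(fp_chance) / (math.log(2) ** 2)


def bloom_size_gib(num_keys: int, fp_chance: float) -> float:
    """Estimated total filter size in GiB for num_keys partition keys."""
    total_bits = num_keys * bloom_bits_per_key(fp_chance)
    return total_bits / 8 / 2**30


if __name__ == "__main__":
    rows = 50_000_000_000  # the table size mentioned above
    for p in (0.1, 0.01, 0.001, 0.0001):
        print(f"fp_chance={p}: {bloom_bits_per_key(p):.2f} bits/key, "
              f"{bloom_size_gib(rows, p):.1f} GiB in total")
```

Under this estimate each tenfold reduction in bloom_filter_fp_chance adds
about 4.8 bits per key (roughly 9.6 bits at 0.01 and 14.4 bits at 0.001),
before replication, and a value of 0 cannot be represented at all.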

Also, please suggest any performance benchmarking techniques (currently we
are planning to trigger massive reads from Spark and check cfstats).

NOTE: we will be deploying DSE on EC2, so please suggest if you have
anything specific to DSE and EC2.

Adarsh

On Wed, May 18, 2016 at 9:45 PM, Jonathan Haddad <jon@jonhaddad.com> wrote:

> The impact is it'll get massively bigger with very little performance
> benefit, if any.
>
> You can't get 0 because it's a probabilistic data structure.  It tells you
> either:
>
> your data is definitely not here
> your data has a pretty decent chance of being here
>
> but never "it's here for sure"
>
> https://en.wikipedia.org/wiki/Bloom_filter
>
> On Wed, May 18, 2016 at 11:04 AM sai krishnam raju potturi <
> pskraju88@gmail.com> wrote:
>
>> hi Adarsh;
>>     were there any drawbacks to setting the bloom_filter_fp_chance  to
>> the default value?
>>
>> thanks
>> Sai
>>
>> On Wed, May 18, 2016 at 2:21 AM, Adarsh Kumar <adarsh0007@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> What is the impact of setting bloom_filter_fp_chance < 0.01?
>>>
>>> During performance tuning I was trying to tune bloom_filter_fp_chance
>>> and have the following questions:
>>>
>>> 1). Why is bloom_filter_fp_chance = 0 not allowed? (
>>> https://issues.apache.org/jira/browse/CASSANDRA-5013)
>>> 2). What is the maximum/recommended value of bloom_filter_fp_chance (if
>>> we do not have any limitation for bloom filter size).
>>>
>>> NOTE: We are using default SizeTieredCompactionStrategy on
>>> cassandra  2.1.8.621
>>>
>>> Thanks in advance..:)
>>>
>>> Adarsh Kumar
>>>
>>
>>
