cassandra-commits mailing list archives

From "Paulo Motta (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9830) Option to disable bloom filter in highest level of LCS sstables
Date Wed, 16 Mar 2016 15:10:33 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197470#comment-15197470 ]

Paulo Motta commented on CASSANDRA-9830:
----------------------------------------

Ok, it seems after CASSANDRA-11344 we now have consistent and predictable results:

* Scenario A: organic compactions: bloom_filter_fp_chance = 0.1 vs lower bloom_filter_fp_chance = 0.01
** Analysis: Savings are consistent across bfpc values. The takeaway is that with the patch you can use a lower (stricter) bfpc while keeping about the same memory footprint: patched at 0.01 uses less than trunk at 0.1.

||[organic1a|http://cstar.datastax.com/tests/id/3c02e674-eab2-11e5-ac91-0256e416528f]||trunk||patched||savings||
|node1|11684936|4772280|59.16%|
|node2|11704648|4791896|59.06%|
|node3|11954248|4792088|59.91%|

||[organic1b (lower bloom_filter_fp_chance)|http://cstar.datastax.com/tests/id/3c67130e-eaff-11e5-b22b-0256e416528f]||trunk||patched||savings||
|node1|23910064|9595248|59.87%|
|node2|23412280|9595000|59.02%|
|node3|23408696|9589704|59.03%|
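
The roughly 2x jump in trunk size between bfpc = 0.1 and bfpc = 0.01 matches what the textbook Bloom filter sizing formula predicts. The sketch below uses that standard formula (bits per key = -ln(p) / (ln 2)^2); it is an assumption for illustration and not Cassandra's actual allocation code, and the helper name `bits_per_key` is hypothetical.

```python
import math


def bits_per_key(fp_chance: float) -> float:
    """Bits per key for an optimally sized Bloom filter (textbook formula).

    Assumption: this is the standard sizing formula, not Cassandra's
    real allocation logic, which may round or bucket differently.
    """
    return -math.log(fp_chance) / (math.log(2) ** 2)


# Predicted growth when moving from bfpc = 0.1 to bfpc = 0.01.
predicted_ratio = bits_per_key(0.01) / bits_per_key(0.1)

# Observed growth on trunk, node1 (organic1b vs organic1a).
observed_ratio = 23910064 / 11684936

print(f"predicted ratio: {predicted_ratio:.2f}")
print(f"observed ratio:  {observed_ratio:.2f}")
```

Both ratios come out close to 2, which is consistent with the savings percentage staying flat across the two bfpc settings.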

* Scenario B: major compactions: bloom_filter_fp_chance = 0.1 vs lower bloom_filter_fp_chance = 0.01
** Analysis: Savings are consistent across bfpc values. Savings are slightly lower than in Scenario A, probably because bloom filters are allocated differently in major compactions, but this is likely nothing to worry about.

||[major1a|http://cstar.datastax.com/tests/id/5661a302-eab2-11e5-ac91-0256e416528f]||trunk||patched||savings||
|node1|8026368|3818000|52.43%|
|node2|8026368|3822080|52.38%|
|node3|8026368|3822080|52.38%|

||[major1b (lower bloom_filter_fp_chance)|http://cstar.datastax.com/tests/id/39f17b6e-eaff-11e5-b22b-0256e416528f]||trunk||patched||savings||
|node1|16035264|7644000|52.33%|
|node2|16052400|7644000|52.38%|
|node3|16052400|7644000|52.38%|

* Scenario C: incremental repairs
** Analysis: Savings are still consistent with incremental repair. The savings are higher, probably because after anticompaction sstables in the top level are moved from unrepaired to repaired and stay in the highest level, so there are more sstables in the top level and thus higher savings.

||[repair1a|http://cstar.datastax.com/tests/id/9501e088-ea33-11e5-847f-0256e416528f]||trunk||patched||savings||
|node1|12234296|4112240|66.39%|
|node2|12695872|4187680|67.02%|
|node3|12694680|4183600|67.04%|
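
For anyone re-checking the savings column, it is just 1 - patched/trunk. A minimal sketch using the repair1a numbers above (the helper name `savings_pct` is hypothetical):

```python
def savings_pct(trunk: int, patched: int) -> float:
    """Percentage reduction of the patched value relative to trunk."""
    return round(100.0 * (1.0 - patched / trunk), 2)


# (trunk, patched) pairs copied from the repair1a table.
repair1a = {
    "node1": (12234296, 4112240),
    "node2": (12695872, 4187680),
    "node3": (12694680, 4183600),
}

for node, (trunk, patched) in repair1a.items():
    print(node, savings_pct(trunk, patched))
```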

Rebased above branch without conflicts.

> Option to disable bloom filter in highest level of LCS sstables
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-9830
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9830
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Compaction
>            Reporter: Jonathan Ellis
>            Assignee: Paulo Motta
>            Priority: Minor
>              Labels: performance
>             Fix For: 3.x
>
>
> We expect about 90% of data to be in the highest level of LCS in a fully populated series. (See also CASSANDRA-9829.)
> Thus if the user is primarily asking for data (partitions) that has actually been inserted, the bloom filter on the highest level only helps reject sstables about 10% of the time.
> We should add an option that suppresses bloom filter creation on top-level sstables. This will dramatically reduce memory usage for LCS and may even improve performance as we no longer check a low-value filter.
> (This is also an idea from RocksDB.)
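
The "about 90%" figure in the description follows from a geometric series if each LCS level is assumed to hold 10x the data of the level below it. A rough sketch under that assumption (the fanout of 10 and the helper name `top_level_fraction` are assumptions for illustration, and L0 is ignored):

```python
def top_level_fraction(levels: int, fanout: int = 10) -> float:
    """Fraction of all data held by the highest level when every
    level from 1..levels is full and level i holds fanout**i units.

    Assumption: fixed fanout between levels, L0 ignored.
    """
    sizes = [fanout ** i for i in range(1, levels + 1)]
    return sizes[-1] / sum(sizes)


for n in (2, 3, 4, 5):
    print(n, f"{top_level_fraction(n):.4f}")
```

With more than a couple of full levels the fraction settles just above 0.9, so a filter on the top level can reject at most the remaining ~10% of lookups for keys that exist.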



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
