cassandra-commits mailing list archives

From "Paulo Motta (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9830) Option to disable bloom filter in highest level of LCS sstables
Date Mon, 22 Feb 2016 19:30:18 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157538#comment-15157538 ]

Paulo Motta commented on CASSANDRA-9830:
----------------------------------------

The minor and major compaction cstar tests have completed. Surprisingly, the situation is now
somewhat inverted: the major compaction test shows the expected results, while the minor
compactions test shows unexpected ones. The other tests failed, apparently due to unrelated
cstar problems, so I will resubmit them.

In the major compaction test the bloom filter size is about 50% smaller, with no decrease in
performance. In the minor compaction test the drop in BF size is modest, and on one of the
machines (blade-11-8a) the bloom filter is actually bigger, which is quite strange.

I suspect this is because the {{SSTableReader.bf}} variable is *not* {{volatile}}, so for some
reason the bloom filter drop is not being propagated to other threads in the minor compaction
test. I will resubmit the tests with a new branch that makes the {{bf}} variable volatile.
Do you have any other hypotheses that could explain this?
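
To make the hypothesis concrete, here is a minimal Java sketch of the pattern being discussed: one thread swaps a reader's filter for a pass-through one, and the field is declared {{volatile}} so that the new reference is visible to threads that read it afterwards. This is not Cassandra's code; the names {{Filter}}, {{SetFilter}}, {{Reader}}, {{dropFilter}} and {{ALWAYS_PRESENT}} are hypothetical stand-ins chosen for illustration, and the byte size is arbitrary.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class VolatileFilterSketch {
    /** Hypothetical stand-in for a bloom filter. */
    interface Filter {
        boolean mightContain(String key);
        long sizeInBytes();
    }

    /** Pass-through filter: never rejects a key and occupies no memory. */
    static final Filter ALWAYS_PRESENT = new Filter() {
        public boolean mightContain(String key) { return true; }
        public long sizeInBytes() { return 0L; }
    };

    /** Exact-membership filter used only to keep the sketch self-contained. */
    static final class SetFilter implements Filter {
        private final Set<String> keys;
        SetFilter(Set<String> keys) { this.keys = keys; }
        public boolean mightContain(String key) { return keys.contains(key); }
        public long sizeInBytes() { return 64L * keys.size(); } // arbitrary size
    }

    /** Hypothetical reader whose filter can be dropped by another thread. */
    static final class Reader {
        // volatile: a write by one thread is visible to later reads by others.
        private volatile Filter bf;
        Reader(Filter bf) { this.bf = bf; }
        void dropFilter() { bf = ALWAYS_PRESENT; }
        boolean mightContain(String key) { return bf.mightContain(key); }
        long filterSize() { return bf.sizeInBytes(); }
    }

    public static void main(String[] args) throws InterruptedException {
        Reader reader = new Reader(new SetFilter(new HashSet<>(Arrays.asList("a", "b"))));
        // Simulate a compaction thread dropping the filter.
        Thread compaction = new Thread(reader::dropFilter);
        compaction.start();
        compaction.join();
        System.out.println("filter size after drop: " + reader.filterSize());
    }
}
```

Without {{volatile}} (or another happens-before edge such as a lock), the Java memory model does not guarantee that a thread already holding the reader will ever observe the replaced reference, which is the effect suspected above.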

Resubmitted tests:

* [inc repair|http://cstar.datastax.com/tests/id/800121ce-d987-11e5-bfef-0256e416528f]
* [higher bffp chance|http://cstar.datastax.com/tests/id/800121ce-d987-11e5-bfef-0256e416528f]
* [minors volatile|http://cstar.datastax.com/tests/id/ae11fc02-d999-11e5-bfef-0256e416528f]
* [minors with flush|http://cstar.datastax.com/tests/id/340986c2-d99a-11e5-bfef-0256e416528f]

Results below:

*LCS with major compaction*

* blade-11-6a:
** trunk:
{noformat}
        SSTables in each level: [0, 10, 11, 0, 0, 0, 0, 0, 0]
        Space used (live): 3688151652
        Space used (total): 3688151652
        Bloom filter false positives: 0
        Bloom filter false ratio: 0.00000
        Bloom filter space used: 8017800
        Bloom filter off heap memory used: 8017632
{noformat}
** patched:
{noformat}
        SSTables in each level: [0, 10, 11, 0, 0, 0, 0, 0, 0]
        Space used (live): 3819623608
        Space used (total): 3819623608
        Bloom filter false positives: 0
        Bloom filter false ratio: 0.00000
        Bloom filter space used: 3822080
        Bloom filter off heap memory used: 3822000
{noformat}

* blade-11-7a:
** trunk:
{noformat}
        SSTables in each level: [0, 10, 11, 0, 0, 0, 0, 0, 0]
        Space used (live): 3695491016
        Space used (total): 3695491016
        Bloom filter false positives: 1
        Bloom filter false ratio: 0.00000
        Bloom filter space used: 8026368
        Bloom filter off heap memory used: 8026200
{noformat}
** patched:
{noformat}
        SSTables in each level: [0, 10, 11, 0, 0, 0, 0, 0, 0]
        Space used (live): 3822002992
        Space used (total): 3822002992
        Bloom filter false positives: 1
        Bloom filter false ratio: 0.00000
        Bloom filter space used: 3818000
        Bloom filter off heap memory used: 3817920
{noformat}
* blade-11-8a:
** trunk:
{noformat}
        SSTables in each level: [0, 10, 11, 0, 0, 0, 0, 0, 0]
        Space used (live): 3688261843
        Space used (total): 3688261843
        Bloom filter false positives: 0
        Bloom filter false ratio: 0.00000
        Bloom filter space used: 8017800
        Bloom filter off heap memory used: 8017632
{noformat}
** patched:
{noformat}
        SSTables in each level: [0, 10, 11, 0, 0, 0, 0, 0, 0]
        Space used (live): 3817162026
        Space used (total): 3817162026
        Bloom filter false positives: 0
        Bloom filter false ratio: 0.00000
        Bloom filter space used: 3818000
        Bloom filter off heap memory used: 3817920
{noformat}

*LCS with minor/organic compactions:*

* blade-11-6a:
** trunk:
{noformat}
        SSTables in each level: [0, 10, 16, 0, 0, 0, 0, 0, 0]
        Space used (live): 4553622348
        Space used (total): 4553622348
        Bloom filter false positives: 12
        Bloom filter false ratio: 0.00000
        Bloom filter space used: 57081688
        Bloom filter off heap memory used: 57081480
{noformat}
** patched:
{noformat}
        SSTables in each level: [0, 10, 16, 0, 0, 0, 0, 0, 0]
        Space used (live): 4661123571
        Space used (total): 4661123571
        Bloom filter false positives: 91
        Bloom filter false ratio: 0.00001
        Bloom filter space used: 50248400
        Bloom filter off heap memory used: 50248320
{noformat}

* blade-11-7a:
** trunk:
{noformat}
        SSTables in each level: [0, 10, 16, 0, 0, 0, 0, 0, 0]
        Space used (live): 4555354837
        Space used (total): 4555354837
        Bloom filter false positives: 10
        Bloom filter false ratio: 0.00000
        Bloom filter space used: 65744400
        Bloom filter off heap memory used: 65744192
{noformat}
** patched:
{noformat}
        SSTables in each level: [0, 10, 16, 0, 0, 0, 0, 0, 0]
        Space used (live): 4668234630
        Space used (total): 4668234630
        Bloom filter false positives: 115
        Bloom filter false ratio: 0.00001
        Bloom filter space used: 50272560
        Bloom filter off heap memory used: 50272480
{noformat}

* blade-11-8a:
** trunk:
{noformat}
        SSTables in each level: [0, 10, 16, 0, 0, 0, 0, 0, 0]
        Space used (live): 4553532275
        Space used (total): 4553532275
        Bloom filter false positives: 9
        Bloom filter false ratio: 0.00000
        Bloom filter space used: 57081520
        Bloom filter off heap memory used: 57081312
{noformat}
** patched:
{noformat}
        SSTables in each level: [0, 10, 16, 0, 0, 0, 0, 0, 0]
        Space used (live): 4677259623
        Space used (total): 4677259623
        Bloom filter false positives: 76
        Bloom filter false ratio: 0.00001
        Bloom filter space used: 59167920
        Bloom filter off heap memory used: 59167840
{noformat}
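
For convenience, the relative change in "Bloom filter space used" between trunk and patched can be derived from the figures above. The following is a small sketch that only restates the reported byte counts; the class and method names are made up for illustration.

```java
public class BloomFilterDelta {
    /** Relative change from trunk to patched, in percent (negative means smaller). */
    static double percentChange(long trunk, long patched) {
        return 100.0 * (patched - trunk) / trunk;
    }

    public static void main(String[] args) {
        String[] hosts = {"blade-11-6a", "blade-11-7a", "blade-11-8a"};
        // {trunk, patched} "Bloom filter space used" values copied from the results above.
        long[][] major = {{8017800L, 3822080L}, {8026368L, 3818000L}, {8017800L, 3818000L}};
        long[][] minor = {{57081688L, 50248400L}, {65744400L, 50272560L}, {57081520L, 59167920L}};

        for (int i = 0; i < hosts.length; i++) {
            System.out.printf("%s major: %+.1f%%  minor: %+.1f%%%n",
                    hosts[i],
                    percentChange(major[i][0], major[i][1]),
                    percentChange(minor[i][0], minor[i][1]));
        }
    }
}
```

With these inputs the major compaction runs all come out at a reduction of a little over 50%, while the minor compaction runs range from a moderate reduction to a small increase on blade-11-8a.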

> Option to disable bloom filter in highest level of LCS sstables
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-9830
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9830
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Compaction
>            Reporter: Jonathan Ellis
>            Assignee: Paulo Motta
>            Priority: Minor
>              Labels: performance
>             Fix For: 3.x
>
>
> We expect about 90% of data to be in the highest level of LCS in a fully populated series.
> (See also CASSANDRA-9829.)
> Thus if the user is primarily asking for data (partitions) that has actually been inserted,
> the bloom filter on the highest level only helps reject sstables about 10% of the time.
> We should add an option that suppresses bloom filter creation on top-level sstables.
> This will dramatically reduce memory usage for LCS and may even improve performance as we
> no longer check a low-value filter.
> (This is also an idea from RocksDB.)
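
The "about 90%" figure in the description can be reproduced with a short calculation, assuming each LCS level holds ten times as much data as the level below it. The growth factor of 10 is an assumption made here for illustration (it is not stated in this message), and the class and method names are hypothetical.

```java
public class LcsTopLevelShare {
    /**
     * Fraction of all data held by the highest level when levels 1..levels are
     * fully populated and each level is `fanout` times larger than the previous one.
     */
    static double topLevelFraction(int fanout, int levels) {
        if (fanout < 2 || levels < 1) {
            throw new IllegalArgumentException("fanout must be >= 2 and levels >= 1");
        }
        double size = 1.0;   // size of the current level, in arbitrary units
        double total = 0.0;  // combined size of all levels seen so far
        for (int level = 1; level <= levels; level++) {
            size *= fanout;
            total += size;
        }
        return size / total; // after the loop, `size` is the top level's size
    }

    public static void main(String[] args) {
        for (int levels = 2; levels <= 5; levels++) {
            System.out.printf("levels=%d top-level share=%.4f%n",
                    levels, topLevelFraction(10, levels));
        }
    }
}
```

Under that assumption the share approaches (fanout - 1) / fanout = 0.9 as the number of levels grows, which matches the estimate in the description.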



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
