cassandra-commits mailing list archives

From "Carl Yeksigian (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9830) Option to disable bloom filter in highest level of LCS sstables
Date Mon, 11 Jan 2016 17:08:39 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15092288#comment-15092288 ]

Carl Yeksigian commented on CASSANDRA-9830:
-------------------------------------------

A new run just finished, and the results look much more like what we would have expected.

Once you have finished adding support for skipping bloom filters in the major LCS compaction,
we should add the {{compact}} step back in and check the results again. Otherwise, this patch
appears to be having the desired effect: it reduces memory usage without reducing
performance.

* blade-11-2a
** trunk
    {noformat}
    SSTable count: 26
    SSTables in each level: [0, 10, 16, 0, 0, 0, 0, 0, 0]
    Space used (live): 4666235284
    Space used (total): 4666235284
    Bloom filter false positives: 69
    Bloom filter false ratio: 0.00000
    Bloom filter space used: 99436392
    Bloom filter off heap memory used: 99436184
    {noformat}
** skip_top_level_bloom_filter
    {noformat}
    SSTable count: 26
    SSTables in each level: [0, 10, 16, 0, 0, 0, 0, 0, 0]
    Space used (live): 4388657699
    Space used (total): 4388657699
    Bloom filter false positives: 586
    Bloom filter false ratio: 0.00002
    Bloom filter space used: 51519600
    Bloom filter off heap memory used: 51519520
    {noformat}
* blade-11-3a
** trunk
    {noformat}
    SSTable count: 26
    SSTables in each level: [0, 10, 16, 0, 0, 0, 0, 0, 0]
    Space used (live): 4666224073
    Space used (total): 4666224073
    Bloom filter false positives: 51
    Bloom filter false ratio: 0.00000
    Bloom filter space used: 99436568
    Bloom filter off heap memory used: 99436360
    {noformat}
** skip_top_level_bloom_filter
    {noformat}
    SSTable count: 26
    SSTables in each level: [0, 10, 16, 0, 0, 0, 0, 0, 0]
    Space used (live): 4394200299
    Space used (total): 4394200299
    Bloom filter false positives: 636
    Bloom filter false ratio: 0.00002
    Bloom filter space used: 51392880
    Bloom filter off heap memory used: 51392800
    {noformat}
* blade-11-4a
** trunk
    {noformat}
    SSTable count: 26
    SSTables in each level: [0, 10, 16, 0, 0, 0, 0, 0, 0]
    Space used (live): 4590486878
    Space used (total): 4590486878
    Bloom filter false positives: 55
    Bloom filter false ratio: 0.00000
    Bloom filter space used: 58160752
    Bloom filter off heap memory used: 58160544
    {noformat}
** skip_top_level_bloom_filter
    {noformat}
    SSTable count: 26
    SSTables in each level: [0, 10, 16, 0, 0, 0, 0, 0, 0]
    Space used (live): 4388596350
    Space used (total): 4388596350
    Bloom filter false positives: 560
    Bloom filter false ratio: 0.00002
    Bloom filter space used: 51519600
    Bloom filter off heap memory used: 51519520
    {noformat}
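
To make the memory comparison above easier to read, here is a minimal sketch that computes the relative drop in "Bloom filter space used" per node. The byte counts are copied from the statistics above; the names {{BLOOM_FILTER_BYTES}}, {{percent_reduction}} and {{summarize}} are illustrative only and are not part of the patch or of any Cassandra tooling.

```python
# "Bloom filter space used" in bytes, copied from the per-node output above.
BLOOM_FILTER_BYTES = {
    "blade-11-2a": {"trunk": 99436392, "skip_top_level_bloom_filter": 51519600},
    "blade-11-3a": {"trunk": 99436568, "skip_top_level_bloom_filter": 51392880},
    "blade-11-4a": {"trunk": 58160752, "skip_top_level_bloom_filter": 51519600},
}


def percent_reduction(before: int, after: int) -> float:
    """Return how much smaller `after` is than `before`, as a percentage."""
    return 100.0 * (before - after) / before


def summarize(stats: dict) -> dict:
    """Map each node name to its bloom filter space reduction in percent."""
    return {
        node: percent_reduction(v["trunk"], v["skip_top_level_bloom_filter"])
        for node, v in stats.items()
    }


if __name__ == "__main__":
    for node, pct in summarize(BLOOM_FILTER_BYTES).items():
        print(f"{node}: {pct:.1f}% less bloom filter space than trunk")
```

The same helper could be applied to the "Bloom filter off heap memory used" lines. Note that the trunk figure for blade-11-4a is much lower than on the other two nodes, so its computed reduction is correspondingly smaller.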


> Option to disable bloom filter in highest level of LCS sstables
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-9830
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9830
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Compaction
>            Reporter: Jonathan Ellis
>            Assignee: Paulo Motta
>            Priority: Minor
>              Labels: performance
>             Fix For: 3.x
>
>
> We expect about 90% of data to be in the highest level of LCS in a fully populated series. (See also CASSANDRA-9829.)
> Thus if the user is primarily asking for data (partitions) that has actually been inserted, the bloom filter on the highest level only helps reject sstables about 10% of the time.
> We should add an option that suppresses bloom filter creation on top-level sstables. This will dramatically reduce memory usage for LCS and may even improve performance as we no longer check a low-value filter.
> (This is also an idea from RocksDB.)
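
The "about 90%" figure in the quoted description follows from the geometric sizing of LCS levels. A minimal sketch of that arithmetic, assuming every level is full and each level holds 10 times the data of the level below it (the usual LCS fanout); the function name {{top_level_fraction}} is hypothetical and used only for illustration:

```python
def top_level_fraction(levels: int, fanout: int = 10) -> float:
    """Fraction of all data held by the highest of `levels` full levels.

    Assumes levels L1..Ln are all full and that level i holds fanout**i
    units of data (an idealized, fully populated LCS layout).
    """
    sizes = [fanout ** i for i in range(1, levels + 1)]
    return sizes[-1] / sum(sizes)


if __name__ == "__main__":
    for n in range(2, 6):
        print(f"{n} levels: {top_level_fraction(n):.3f} of data in the top level")
```

Under these assumptions the top-level share stays slightly above 0.9 for any number of levels greater than one, which is where the estimate that the top-level filter rejects sstables only about 10% of the time comes from.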



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
