cassandra-commits mailing list archives

From "Carl Yeksigian (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-9830) Option to disable bloom filter in highest level of LCS sstables
Date Mon, 22 Feb 2016 20:52:18 GMT


Carl Yeksigian commented on CASSANDRA-9830:

The call ordering in {{SSTableReader.releaseBloomFilter}} is wrong; it's causing us not to
reclaim the memory used by the bloom filter. We're currently releasing the newly created {{AlwaysPresentFilter}}
instead of the previous bloom filter.
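
To make the ordering issue concrete, here is a minimal sketch of the intended sequence: capture the old filter, swap in the always-present one, then release the captured reference. The types below ({{BloomFilterLike}}, {{OffHeapFilterSketch}}, {{AlwaysPresentSketch}}, {{ReaderSketch}}) are hypothetical stand-ins for illustration, not the actual Cassandra classes or the code in the patch.

```java
// Hypothetical stand-ins for illustration only; not Cassandra's real classes.
interface BloomFilterLike {
    boolean isPresent(long key);
    void close();
}

final class OffHeapFilterSketch implements BloomFilterLike {
    private volatile boolean released = false;

    public boolean isPresent(long key) { return true; } // real membership test elided
    public void close() { released = true; }            // stands in for freeing the backing memory
    boolean isReleased() { return released; }
}

final class AlwaysPresentSketch implements BloomFilterLike {
    public boolean isPresent(long key) { return true; }
    public void close() { }                              // holds no memory, nothing to free
}

final class ReaderSketch {
    private volatile BloomFilterLike bf;

    ReaderSketch(BloomFilterLike bf) { this.bf = bf; }

    void releaseBloomFilter() {
        BloomFilterLike previous = bf;   // 1. keep a reference to the old filter
        bf = new AlwaysPresentSketch();  // 2. swap in the replacement
        previous.close();                // 3. release the OLD filter, not the new one
    }

    BloomFilterLike filter() { return bf; }
}
```

If the swap happens before the old reference is captured, {{close()}} ends up being called on the new always-present filter, and the original filter's memory is never reclaimed.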

It looks like your new branch is skip_top_level_bloom_filter_volatile; can you run the unit
and dtests and add links here?

{{disable_top_level_bloom_filter}} SGTM.

It seems a little unintuitive that changing {{skip_top_level_bloom_filter}} from true to false
will not cause the top level to have bloom filters -- the other direction does take effect, though
(we delete all the top level bloom filters when switching to skip bf). We should probably generate
a client-side warning when someone makes that change, just so users are aware the bloom filters
won't come back until compaction regenerates them.

We should be able to use {{LeveledCompactionStrategy.addSSTable}} and {{LeveledCompactionStrategy.startup}}
instead of adding a new call into the ACS hierarchy. Also, this way we know which sstable
we have just added, instead of having to iterate over all of the ones in the top level. We
just need to add a check to {{isActive}} in {{addSSTable}} to make sure we aren't just now
adding all sstables (and not necessarily looking at the right levels), and that we look at
all of the top level in {{startup}}.
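
Roughly what I have in mind, as a simplified sketch. Everything here ({{SSTableSketch}}, {{LeveledStrategySketch}}, the {{bloomFilterReleased}} flag, and treating the highest populated level as "top") is a hypothetical model for illustration, not the real {{LeveledCompactionStrategy}} API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model; not the real Cassandra classes.
final class SSTableSketch {
    final int level;
    boolean bloomFilterReleased = false;

    SSTableSketch(int level) { this.level = level; }
}

final class LeveledStrategySketch {
    private final List<SSTableSketch> sstables = new ArrayList<>();
    private final boolean skipTopLevelBloomFilter;
    private boolean isActive = false;

    LeveledStrategySketch(boolean skipTopLevelBloomFilter) {
        this.skipTopLevelBloomFilter = skipTopLevelBloomFilter;
    }

    void addSSTable(SSTableSketch added) {
        sstables.add(added);
        // Before startup() we are still loading every sstable, so the top
        // level is not known yet; defer the decision to startup().
        if (isActive && skipTopLevelBloomFilter && added.level == topLevel())
            added.bloomFilterReleased = true; // only the sstable we just added
    }

    void startup() {
        isActive = true;
        if (!skipTopLevelBloomFilter)
            return;
        // One pass over the whole top level, done once at startup.
        int top = topLevel();
        for (SSTableSketch s : sstables)
            if (s.level == top)
                s.bloomFilterReleased = true;
    }

    // Assumption: "top level" means the highest level that holds any sstable.
    private int topLevel() {
        int top = 0;
        for (SSTableSketch s : sstables)
            top = Math.max(top, s.level);
        return top;
    }
}
```

After {{startup}}, each {{addSSTable}} call only has to look at the sstable it was handed, so there is no need to iterate over the whole top level on every addition.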

It looks like we use {{instanceof AlwaysPresentFilter}} in CompactionController; we should
consolidate that and {{IFilter.isAlwaysPresent()}}.
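
A sketch of the consolidation, assuming the check lives on the filter interface and callers stop using {{instanceof}}. The names {{FilterSketch}}, {{AlwaysPresentFilterSketch}}, {{ControllerSketch}} and {{shouldConsultFilter}} are hypothetical and only illustrate the shape; the real signatures are whatever the patch defines.

```java
// Hypothetical types for illustration; not the actual IFilter / CompactionController code.
interface FilterSketch {
    boolean isPresent(long key);

    // Single place to answer "does this filter ever reject anything?"
    default boolean isAlwaysPresent() { return false; }
}

final class AlwaysPresentFilterSketch implements FilterSketch {
    public boolean isPresent(long key) { return true; }

    @Override
    public boolean isAlwaysPresent() { return true; }
}

final class ControllerSketch {
    // Callers ask the filter instead of testing its concrete class.
    static boolean shouldConsultFilter(FilterSketch filter) {
        return !filter.isAlwaysPresent();
    }
}
```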

> Option to disable bloom filter in highest level of LCS sstables
> ---------------------------------------------------------------
>                 Key: CASSANDRA-9830
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Compaction
>            Reporter: Jonathan Ellis
>            Assignee: Paulo Motta
>            Priority: Minor
>              Labels: performance
>             Fix For: 3.x
> We expect about 90% of data to be in the highest level of LCS in a fully populated series. (See also CASSANDRA-9829.)
> Thus if the user is primarily asking for data (partitions) that has actually been inserted, the bloom filter on the highest level only helps reject sstables about 10% of the time.
> We should add an option that suppresses bloom filter creation on top-level sstables. This will dramatically reduce memory usage for LCS and may even improve performance as we no longer check a low-value filter.
> (This is also an idea from RocksDB.)

This message was sent by Atlassian JIRA
