cassandra-commits mailing list archives

From "Paulo Motta (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9830) Option to disable bloom filter in highest level of LCS sstables
Date Fri, 08 Jan 2016 19:27:40 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089775#comment-15089775 ]

Paulo Motta commented on CASSANDRA-9830:
----------------------------------------

bq. The cstar runs that you kicked off didn't work because you aren't on the list of repos
for cstar, so I kicked off a new ssd test. The increase looks modest, but there is an improvement.

Thanks for triggering the test again!

bq. Also, is there any way to get memory usage during the tests?

The last {{nodetool tablestats}} command should print memory usage stats in the console, but
for some reason the console output is unavailable. Is there any way to retrieve it manually
[~enigmacurry]?

bq. I think we should be skipping creating the bloom filter for the leveled major compaction
as well. That's because in major compaction, while we aren't always adding at what ends up
being the highest level after we are done, we are always writing the highest level for a given
key. Plus, this will ensure that whichever level ends up as the highest will not have bloom
filters.

Thanks for the feedback. That's right, we should definitely support this in major leveled
compaction. I will update the patch and post back soon.

> Option to disable bloom filter in highest level of LCS sstables
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-9830
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9830
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Compaction
>            Reporter: Jonathan Ellis
>            Assignee: Paulo Motta
>            Priority: Minor
>              Labels: performance
>             Fix For: 3.x
>
>
> We expect about 90% of data to be in the highest level of LCS in a fully populated series.
> (See also CASSANDRA-9829.)
> Thus if the user is primarily asking for data (partitions) that has actually been inserted,
> the bloom filter on the highest level only helps reject sstables about 10% of the time.
> We should add an option that suppresses bloom filter creation on top-level sstables.
> This will dramatically reduce memory usage for LCS and may even improve performance as we
> no longer check a low-value filter.
> (This is also an idea from RocksDB.)
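
The 90% figure in the description above can be checked with a little arithmetic. The sketch below is illustrative only and is not taken from the patch: it assumes every level is full and ten times the size of the level below it, that keys are spread across levels in proportion to data size, and that filters are sized with the textbook bloom filter formula rather than Cassandra's own sizing code. The function names are hypothetical.

```python
import math


def top_level_fraction(levels: int, fanout: int = 10) -> float:
    """Fraction of all data held by the highest level.

    Assumes every level is full and each level is `fanout` times
    larger than the one below it (an assumption for illustration).
    """
    sizes = [fanout ** i for i in range(1, levels + 1)]
    return sizes[-1] / sum(sizes)


def bloom_filter_bits(keys: int, fp_chance: float) -> float:
    """Textbook optimal bloom filter size in bits: -n * ln(p) / (ln 2)^2.

    This is the standard formula, not necessarily how Cassandra
    sizes its filters.
    """
    return -keys * math.log(fp_chance) / (math.log(2) ** 2)


if __name__ == "__main__":
    # Hypothetical workload: 1 million keys, 1% false-positive chance.
    total_keys = 1_000_000
    fp_chance = 0.01

    frac = top_level_fraction(levels=4)
    all_bits = bloom_filter_bits(total_keys, fp_chance)
    # Under the textbook formula, filter size is linear in the key
    # count, so the share of filter memory attributable to the top
    # level equals its share of keys.
    top_bits = bloom_filter_bits(int(total_keys * frac), fp_chance)

    print(f"top-level share of data: {frac:.3f}")
    print(f"filter bits for all keys: {all_bits:.0f}")
    print(f"filter bits for top level only: {top_bits:.0f}")
```

Under these assumptions the top level holds roughly nine tenths of the data regardless of how many levels exist, so dropping its filters would remove roughly nine tenths of the bloom filter memory.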



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
