cassandra-commits mailing list archives

From "Peter Schuller (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-4303) Compressed bloomfilters
Date Sat, 02 Jun 2012 19:54:23 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288012#comment-13288012 ]

Peter Schuller commented on CASSANDRA-4303:
-------------------------------------------

I'm highly skeptical of not locking BF in memory.

Whenever you page out to disk, you're instantly killed from a disk I/O perspective, since by
definition there will be absolutely no locality whatsoever in the bloom filter access pattern,
nor will caching be efficient (with a sparsely accessed BF you're pulling in a 4k page or
more to read a single bit of information).

Put it this way: if even 1% of your bloom filter is not in memory, your performance will be
*abysmal* compared to any CPU-bound workload, if you're on platters.
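
To put rough numbers on the 1% figure, here is a back-of-the-envelope sketch in Java. It assumes the probes of a lookup are spread uniformly and independently over the filter, that every probe landing on a non-resident page costs one disk seek, and that no probe is skipped by short-circuiting. The class and method names are hypothetical, and the example values (5 probes per lookup, 10 ms per seek) are illustrative assumptions, not figures from this ticket.

```java
// Hypothetical helper, not part of Cassandra: estimates the cost of a
// partially paged-out bloom filter under simple uniform-access assumptions.
final class BloomFilterPaging {
    private BloomFilterPaging() {}

    /** Number of filter bits pulled in by one page read (4096 bytes -> 32768 bits). */
    static long bitsPerPage(int pageSizeBytes) {
        return (long) pageSizeBytes * 8;
    }

    /**
     * Probability that a single lookup touches at least one non-resident page,
     * assuming each of the `probes` bit reads independently hits a non-resident
     * page with probability `nonResidentFraction`.
     */
    static double faultProbability(double nonResidentFraction, int probes) {
        return 1.0 - Math.pow(1.0 - nonResidentFraction, probes);
    }

    /**
     * Upper bound on lookups per second for one spindle, assuming every probe
     * that misses memory costs one seek of `seekMillis` and seeks are serialized.
     */
    static double maxLookupsPerSecond(double nonResidentFraction, int probes, double seekMillis) {
        double expectedFaultsPerLookup = probes * nonResidentFraction;
        return 1000.0 / (expectedFaultsPerLookup * seekMillis);
    }

    public static void main(String[] args) {
        double f = 0.01;      // assumed: 1% of the filter is paged out
        int probes = 5;       // assumed: bit reads per lookup
        double seekMs = 10.0; // assumed: seek time of a spinning disk

        System.out.println("bits read per page fault: " + bitsPerPage(4096));
        System.out.println("P(lookup faults):         " + faultProbability(f, probes));
        System.out.println("max lookups/s per disk:   " + maxLookupsPerSecond(f, probes, seekMs));
    }
}
```

Under those assumed values, about 4.9% of lookups take at least one page fault and a single disk tops out at roughly 2000 filter lookups per second, while each fault reads 32768 bits to use one of them.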

I don't think CPU efficiency is the interest here, nor overhead of page faults. The problem
is rather that you will be absolutely killed as soon as even a tiny fraction is no longer
in memory.

SSDs may change the abysmal bit, since they are so fast that a multi-SSD machine will easily
be CPU bound with Cassandra, but then simply reading from the sstables isn't obviously slower
than looking up the bloom filter. I'd expect it to be faster in many cases (fewer I/Os) if
you're relying on page cache while a significant amount of the bloom filter is not in memory.
                
> Compressed bloomfilters
> -----------------------
>
>                 Key: CASSANDRA-4303
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4303
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Brandon Williams
>             Fix For: 1.2
>
>
> Very commonly, people encountering an OOM need to increase their bloom filter false positive
> ratio to reduce memory pressure, since BFs tend to be the largest shareholder.  It would make
> sense if we could alleviate the memory pressure from BFs with compression while maintaining
> the FP ratio (at the cost of a bit of CPU) that some users have come to expect.  One possible
> implementation is at http://code.google.com/p/javaewah/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
