cassandra-commits mailing list archives

From "Stu Hood (JIRA)" <j...@apache.org>
Subject [jira] Created: (CASSANDRA-1555) Considerations for larger bloom filters
Date Wed, 29 Sep 2010 04:44:33 GMT
Considerations for larger bloom filters
---------------------------------------

                 Key: CASSANDRA-1555
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1555
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Stu Hood
             Fix For: 0.8


To (optimally) support SSTables larger than 143 million keys, we need to support bloom filters
larger than 2^31 bits (2 Gbit, or roughly 15 bits per key at that key count), which
java.util.BitSet can't handle directly because it is indexed by int.

A few options:
* Switch to a BitSet class which supports 2^63 bits (Lucene's OpenBitSet)
* Partition the java.util.BitSet behind our current BloomFilter
** Straightforward bit partitioning: bit N is in bitset N // 2^31, at offset N % 2^31
** Separate equally sized complete bloom filters for member ranges, which can be used independently
or OR'd together under memory pressure.
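
The straightforward bit-partitioning option could look roughly like the sketch below. This is only an illustration under stated assumptions, not code from Cassandra: the class name PartitionedBitSet, its methods, and the pageShift parameter are all hypothetical. The issue suggests 2^31 bits per partition; the sketch makes the partition size configurable (and caps it at 2^30) so that it can be exercised with small sizes.

```java
import java.util.BitSet;

/**
 * Hypothetical sketch: a bit set addressed by long, backed by an array of
 * java.util.BitSet partitions ("pages"). Bit n lives in page n >>> pageShift
 * at offset n & pageMask, i.e. n / pageSize and n % pageSize.
 */
final class PartitionedBitSet {
    private final int pageShift;   // log2 of the number of bits per page (assumed parameter)
    private final long pageMask;   // (1L << pageShift) - 1
    private final long numBits;    // logical size in bits
    private final BitSet[] pages;

    PartitionedBitSet(long numBits, int pageShift) {
        if (numBits <= 0 || pageShift < 1 || pageShift > 30) {
            throw new IllegalArgumentException("invalid size or page shift");
        }
        this.pageShift = pageShift;
        this.pageMask = (1L << pageShift) - 1;
        this.numBits = numBits;
        // Ceiling division written to avoid overflow for very large numBits.
        long pageCount = ((numBits - 1) >>> pageShift) + 1;
        if (pageCount > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("too many pages");
        }
        this.pages = new BitSet[(int) pageCount];
        for (int i = 0; i < pages.length; i++) {
            pages[i] = new BitSet(); // each page grows lazily as bits are set
        }
    }

    void set(long n) {
        checkIndex(n);
        pages[(int) (n >>> pageShift)].set((int) (n & pageMask));
    }

    boolean get(long n) {
        checkIndex(n);
        return pages[(int) (n >>> pageShift)].get((int) (n & pageMask));
    }

    long size() {
        return numBits;
    }

    int pageCount() {
        return pages.length;
    }

    private void checkIndex(long n) {
        if (n < 0 || n >= numBits) {
            throw new IndexOutOfBoundsException("bit index " + n);
        }
    }
}
```

A bloom filter built on such a class would compute its hash positions as long values modulo size(). The serialized form would presumably need to record the partition size and count, which is one reason the existing serialization cannot be reused unchanged.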

All of these options require new approaches to serialization.
