incubator-cassandra-user mailing list archives

From Jonathan Ellis
Subject Re: BloomFilter is taking too much memory
Date Tue, 04 May 2010 20:50:28 GMT
The BloomFilter is not redundant: it stores information about
_all_ keys, while the index summary samples only one key in every 128.
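
On the sizing question in the quoted message: the following is a rough
sketch, not Cassandra's actual sizing code. It checks the reported figures
(a long[] of 19.4 million entries at 8 bytes each) and applies the textbook
Bloom filter formula m = -n * ln(p) / (ln 2)^2. The function names, the 1%
false-positive target, and the assumption that all 47 million keys live in
this one SSTable are illustrative only.

```python
import math

BITS_PER_LONG = 64


def bitset_bytes(num_longs: int) -> int:
    """Bytes held by a long[] bit set (array header overhead ignored)."""
    return num_longs * BITS_PER_LONG // 8


def bits_per_key(num_longs: int, num_keys: int) -> float:
    """Filter bits available per stored key."""
    return num_longs * BITS_PER_LONG / num_keys


def optimal_bits(num_keys: int, false_positive_rate: float) -> int:
    """Textbook Bloom filter size in bits: m = -n * ln(p) / (ln 2)^2."""
    return math.ceil(
        -num_keys * math.log(false_positive_rate) / math.log(2) ** 2
    )


# Figures from the quoted message; key count per SSTable is an assumption.
reported_longs = 19_400_000
reported_keys = 47_000_000

print(bitset_bytes(reported_longs) / 1e6)                     # 155.2 (MB)
print(round(bits_per_key(reported_longs, reported_keys), 1))  # 26.4
# Hypothetical 1% false-positive target, for comparison only.
print(optimal_bits(reported_keys, 0.01) / 8 / 1e6)            # size in MB
```

The first figure matches the 155MB in the memory dump, so the array itself
accounts for essentially all of the filter's footprint.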

On Tue, May 4, 2010 at 3:47 PM, Weijun Li wrote:
> Hello,
> We stored about 47 million keys on one Cassandra node, and a memory dump
> shows the following for one of the SSTableReaders:
>     SSTableReader: 386MB. Of this 386MB, IndexSummary takes about 231MB
> and BloomFilter takes 155MB, with an embedded huge array long[19.4 million].
> It seems that BloomFilter is taking too much memory. If this is the case,
> BloomFilter seems redundant compared to the size of the index.
> So is this the desired behavior? Is there a formula to estimate the
> memory needed for the BloomFilter?
> Thanks,
> -Weijun

Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
