cassandra-user mailing list archives

From Radim Kolar
Subject Re: reported bloom filter FP ratio
Date Mon, 26 Dec 2011 18:56:37 GMT
My misunderstanding of the FP ratio was based on the assumption that the
ratio is counted from node start, while it is actually
getRecentBloomFilterFalseRatio().
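
The difference between the two readings can be sketched as follows. This
is a minimal illustration, not Cassandra's code: the class
FalsePositiveTracker and its methods are hypothetical, and it assumes that
a "recent" ratio covers only the lookups since the metric was last read,
computed as false positives divided by all positive filter answers.

```python
class FalsePositiveTracker:
    """Hypothetical tracker; names are illustrative, not Cassandra's API."""

    def __init__(self):
        self.total_false = 0   # filter said "maybe present", key was absent
        self.total_true = 0    # filter said "maybe present", key was present
        self.last_false = 0    # snapshot taken at the last recent_ratio() call
        self.last_true = 0

    def record(self, false_positive):
        """Record one positive filter answer and whether it was wrong."""
        if false_positive:
            self.total_false += 1
        else:
            self.total_true += 1

    @staticmethod
    def _ratio(false_count, true_count):
        positives = false_count + true_count
        return false_count / positives if positives else 0.0

    def cumulative_ratio(self):
        """Ratio over everything recorded since the tracker was created."""
        return self._ratio(self.total_false, self.total_true)

    def recent_ratio(self):
        """Ratio over what was recorded since the previous call (assumed semantics)."""
        f = self.total_false - self.last_false
        t = self.total_true - self.last_true
        self.last_false, self.last_true = self.total_false, self.total_true
        return self._ratio(f, t)
```

With 99 true positives and 1 false positive recorded first, and then one
of each, the recent ratio jumps to 0.5 on the second read while the
cumulative ratio stays near 2%. Under these assumptions a high reported
value can reflect a short window rather than the filter's long-run
behaviour.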

 > I don't understand how you reached that conclusion.

On my nodes most memory is consumed by bloom filters. Also, 1.0 creates
larger bloom filters than 0.8, leading to higher memory consumption. I
just checked a few sstables for the index to bloom filter ratio on the
same dataset: in 0.8 bloom filters are about 13% of the index size, and
in 1.0 they are about 16%. The key used in the CF is a fixed-size 4-byte
integer.
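
For a rough sense of why filter size tracks the key count, the standard
bloom filter sizing formula can be sketched as below. This is textbook
math for an optimally configured filter, not a description of how either
Cassandra version sizes its filters; the function names, the key count
and the false positive targets are assumptions chosen for illustration.

```python
import math


def bits_per_key(target_fp_rate):
    """Optimal bits per key for a bloom filter: -ln(p) / (ln 2)^2."""
    return -math.log(target_fp_rate) / (math.log(2) ** 2)


def filter_size_bytes(num_keys, target_fp_rate):
    """Approximate filter size in bytes for num_keys keys at the given FP rate."""
    return math.ceil(num_keys * bits_per_key(target_fp_rate) / 8)


# Illustrative values only: 100 million keys at three FP targets.
for p in (0.1, 0.01, 0.001):
    size_mb = filter_size_bytes(100_000_000, p) / (1024 * 1024)
    print(f"fp={p}: {bits_per_key(p):.2f} bits/key, about {size_mb:.0f} MB")
```

The size depends on the number of keys and the target false positive
rate, not on the key length, so with small fixed-size keys the filter is
a comparatively large fraction of the index.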

Cassandra does not measure memory used by index sampling yet. I suspect
that it will be memory hungry too and that the sampling can be safely
lowered by default; I see very little difference when changing index
sampling from 64 to 512.
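
The effect of the sampling interval on memory can be estimated with a
back-of-the-envelope sketch. It assumes one index entry is kept in memory
for every `interval` keys; the per-entry overhead of 32 bytes on top of
the key is an illustrative guess, not a measured Cassandra figure, and
the function name is hypothetical.

```python
import math


def sampled_index_bytes(num_keys, interval, key_bytes=4, overhead_bytes=32):
    """Estimate memory for an index sample holding every interval-th key.

    overhead_bytes is an assumed per-entry cost (object headers, file
    position), not a value taken from Cassandra.
    """
    samples = math.ceil(num_keys / interval)
    return samples * (key_bytes + overhead_bytes)


# Compare the two intervals mentioned above for a hypothetical 1 million keys.
for interval in (64, 512):
    print(interval, sampled_index_bytes(1_000_000, interval))
```

Whatever the real per-entry cost is, the number of samples is inversely
proportional to the interval, so moving from 64 to 512 cuts the sample
memory by roughly a factor of eight.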

The basic problem with daily Cassandra administration, which I am
currently solving, is that memory consumption grows with your dataset
size. I don't really like this design: you put more data in and the
cluster can OOM. This makes Cassandra a suboptimal solution for data
archiving. It will get better after tunable bloom filters are committed.
