cassandra-user mailing list archives

From Radim Kolar <>
Subject Re: configurable bloom filters (like hbase)
Date Wed, 14 Dec 2011 10:52:28 GMT
On 11.11.2011 7:55, Radim Kolar wrote:
> I have a problem with a large CF (about 200 billion entries per node).
> While I can configure index_interval to lower memory requirements, I
> still have to stick with huge bloom filters.
> Ideally, bloom filters would be configurable as in HBase.
> The Cassandra standard is about 1.05% false positives, but in my case I
> would be fine even with a 20% false positive rate. Data is not often
> read back. Most of it will never be read before it expires via TTL.
Does anybody else have the problem that bloom filters use too much memory
in applications that do not need to read written data often?
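
To put rough numbers on the trade-off, here is a small sketch using the textbook sizing formula for an optimally tuned Bloom filter, bits per key = -ln(p) / (ln 2)^2. This is an assumption for illustration only: Cassandra's actual sizing code may round or cap these values differently, and the function names are hypothetical.

```python
import math


def bits_per_key(fp_rate: float) -> float:
    """Textbook bits per key for a Bloom filter with an optimal hash count.

    Assumption: the standard formula, not Cassandra's own sizing logic.
    """
    if not 0.0 < fp_rate < 1.0:
        raise ValueError("fp_rate must be strictly between 0 and 1")
    return -math.log(fp_rate) / (math.log(2) ** 2)


def filter_bytes(num_keys: int, fp_rate: float) -> float:
    """Approximate filter size in bytes for num_keys entries."""
    return num_keys * bits_per_key(fp_rate) / 8


if __name__ == "__main__":
    keys = 200_000_000_000  # entry count per node quoted in the message
    for p in (0.0105, 0.20):
        gib = filter_bytes(keys, p) / 2**30
        print(f"fp={p:.4f}: {bits_per_key(p):.2f} bits/key, {gib:.1f} GiB")
```

Under this formula, moving from about 1% to 20% false positives cuts the filter to roughly a third of its size, so a 1/10 reduction would need an even higher false positive rate.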

I am looking at the memory used by bloom filters, and it would be ideal to
have in cassandra-1.1 the ability to shrink bloom filters to about 1/10 of
their size. Is it possible to code something like this: save bloom filters
to disk as usual, but during load transform them into something smaller at
the cost of an increased FP rate?
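
One generic way to do such a load-time shrink is "folding": if bit positions are computed as hash mod size, a filter can be reduced by any factor that divides its size by OR-ing the segments together. The sketch below is a toy illustration of that idea, not Cassandra's filter or on-disk format; the `FoldableBloom` class and its hashing scheme are hypothetical.

```python
import hashlib


class FoldableBloom:
    """Toy Bloom filter whose bit array can be folded to a smaller size."""

    def __init__(self, num_bits, num_hashes, bits=None):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple (not memory efficient).
        self.bits = bits if bits is not None else bytearray(num_bits)

    def _positions(self, key: bytes):
        # Full-width hash reduced modulo the current size, so a folded
        # filter sees positions consistent with the original one.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: bytes) -> bool:
        return all(self.bits[pos] for pos in self._positions(key))

    def fold(self, factor: int) -> "FoldableBloom":
        """Return a filter 1/factor the size; factor must divide num_bits."""
        if factor <= 0 or self.num_bits % factor != 0:
            raise ValueError("factor must evenly divide num_bits")
        small = self.num_bits // factor
        folded = bytearray(small)
        for i, bit in enumerate(self.bits):
            if bit:
                # (h mod m) mod (m/f) == h mod (m/f) because m/f divides m.
                folded[i % small] = 1
        return FoldableBloom(small, self.num_hashes, folded)


if __name__ == "__main__":
    big = FoldableBloom(1024, 3)
    big.add(b"row-key-1")
    small = big.fold(8)
    print(small.num_bits, small.might_contain(b"row-key-1"))
```

Folding never introduces false negatives, but the hash count stays tuned for the original size, so the folded filter will likely have a higher FP rate than a filter built from scratch at the smaller size.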
