incubator-cassandra-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From David Strauss <da...@fourkitchens.com>
Subject Re: boonfilters
Date Wed, 07 Apr 2010 23:17:43 GMT
On 2010-04-07 20:34, Peter Schüller wrote:
> Re-sizing a bloom filter implies re-creating it from scratch.

Not necessarily. Depending on your hash, you can sometimes shrink
without regeneration (and without other penalties). It's also sometimes
possible to enlarge the bloom filter without regeneration at the cost of
increased false positives you wouldn't have if you regenerated.

Here's a trivial counterexample to your statement, starting with a bloom
filter with two positions: odd and even.

I can shrink it down to a single bit with an "or" operation. I can
enlarge it to modulo 10 by filling in the odd bits if my current odd bit
is filled and the same with the even numbers and bit.

In the case of the shrink operation, I'm not regenerating or losing any
accuracy. In the case of the enlarge operation, I'll get considerably
more false positives than I would on regeneration, but my operation is
still correct.

Even with complex or cryptographic hashes, a bloom filter based on
using, say, the first X bits might be expandable or shrinkable without
regeneration.

-- 
David Strauss
   | david@fourkitchens.com
Four Kitchens
   | http://fourkitchens.com
   | +1 512 454 6659 [office]
   | +1 512 870 8453 [direct]


Mime
View raw message