incubator-cassandra-user mailing list archives

From David Strauss <da...@fourkitchens.com>
Subject Re: bloom filter
Date Fri, 07 May 2010 11:20:24 GMT
On 2010-05-07 11:03, vineet daniel wrote:
> 2. "It is also important for identifying which SSTable files to look inside
> even when a key is present." - David can you please throw some more
> light on your point, like what are the implications, why do we need to
> identify etc.

A Bloom filter is almost like a street sign that tells you the range of
addresses on a street block. Such a sign doesn't guarantee that every
address in the range exists on the block, but it does mean you can
avoid driving down streets that can't contain the address you're looking
for.

When Cassandra is looking for a key, there could be several SSTable files
that potentially contain it. By checking the Bloom filter for each, it can
avoid looking inside the files that definitely do not have the desired key.

(My analogy breaks down a bit here because the street signs indicate
mutually exclusive ranges of addresses, while the Bloom filters may
indicate the possible presence of a key in *several* files.)
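
To make the lookup concrete, here is a rough sketch in Python. It is not
Cassandra's actual code: the BloomFilter and SSTable classes, the lookup
function, and the filter sizes are all made up for illustration, and a real
SSTable is a file on disk rather than a dict. It only shows the idea that a
"no" from the filter lets you skip a file, while a "maybe" means you still
have to look inside.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter (hypothetical; sizes chosen arbitrarily)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True only means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


class SSTable:
    """Stand-in for one on-disk file: a dict plus its Bloom filter."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = dict(rows)
        self.bloom = BloomFilter()
        for key in self.rows:
            self.bloom.add(key)
        self.reads = 0  # counts how often we had to look inside the file

    def read(self, key):
        self.reads += 1
        return self.rows.get(key)


def lookup(sstables, key):
    """Return (file name, value) for every file that really holds the key."""
    hits = []
    for table in sstables:
        if not table.bloom.might_contain(key):
            continue  # filter says "definitely not here": skip the file
        value = table.read(key)  # filter says "maybe": must actually check
        if value is not None:
            hits.append((table.name, value))
    return hits


tables = [
    SSTable("a", {"alice": 1}),
    SSTable("b", {"bob": 2, "alice": 3}),
    SSTable("c", {"carol": 4}),
]
print(lookup(tables, "alice"))  # [('a', 1), ('b', 3)]
```

The key "alice" lives in two files, so both filters answer "maybe" and both
files get read. File "c" is skipped unless its filter happens to give a false
positive, which is possible but unlikely at these sizes.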

-- 
David Strauss
   | david@fourkitchens.com
Four Kitchens
   | http://fourkitchens.com
   | +1 512 454 6659 [office]
   | +1 512 870 8453 [direct]

