incubator-cassandra-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Stu Hood" <stu.h...@rackspace.com>
Subject RE: boonfilters
Date Thu, 08 Apr 2010 01:07:06 GMT
> 1. Is the only place boonfilters are used in Cassandra is when you want to
> see if a particular key exists in a particular node?
Each node stores a set of SSTables locally, each with a BloomFilter attached: the filters
are used to check whether a particular SSTable contains information about a key/row. See http://wiki.apache.org/cassandra/MemtableSSTable
for more information on SSTables.

> 2. Are boonfilters a fixed size, or they adjust as to the # of keys?  any
> example size?
Since SSTables are immutable once written, the BloomFilter for an SSTable has a fixed size
based on the number of keys that will be stored in the SSTable. There are calculations embedded
in Cassandra that attempt to make an optimal size/false positive tradeoff: see org.apache.cassandra.utils.BloomCalculations.

> 3. ...
> It says "yes", but when you do a lookup the object returned is null
No: the BloomFilter answers the question "might there be data for this key in this SSTable?",
and the two possible answers it can give are "maybe" or "definitely not". When the BloomFilter
says "maybe" we have to go to disk to check out the content of the SSTable.

Thanks,
Stu

-----Original Message-----
From: "S Ahmed" <sahmed1020@gmail.com>
Sent: Wednesday, April 7, 2010 3:27pm
To: dev@cassandra.apache.org
Subject: boonfilters

Just reading up on boonfilters, few questions.

Basically boonfilters let give you a true/false if a particular key exists,
and they *may* give you a false positive i.e. they key exists but never a
false negative i.e. the key doesn't exist.

The core of boonfilters is its hashing mechanism that marks the in-memory
matrix/map if the key exists.

1. Is the only place boonfilters are used in Cassandra is when you want to
see if a particular key exists in a particular node?

2. Are boonfilters a fixed size, or they adjust as to the # of keys?  any
example size?

3. Boonfilters don't give false negatives:
    So you hit a node, and perform a lookup in the boonfilter for a key.  It
says "yes", but when you do a lookup the object returned is null, so then
you flag that this node needs this particular key during replication.


Have I grasp this concept?

Really loving this project, learning allot from the code.  It would be great
if someone could do a walkthrough of common functionality in a detailed way
:)



Mime
View raw message