incubator-cassandra-dev mailing list archives

From David Strauss <da...@fourkitchens.com>
Subject Re: Cassandra on top of B-Tree
Date Sun, 28 Mar 2010 21:33:04 GMT
On 2010-03-28 21:11, Primal Wijesekera wrote:
> I am a master's student in the UBC CS department. One of my lab mates and
> I are trying to implement Cassandra on top of a B-tree rather than the
> DHT approach it uses now. We hope to benchmark the two approaches and
> see which one scales better.
>
> Our lab already has a project (not yet completed) to develop a
> distributed B-tree on top of a Sinfonia-like system. We would try to
> integrate the Cassandra source with the B-tree while preserving the rest
> of the Cassandra logic.
>
> Since this experiment is still at a very early stage, we wanted to get
> your expert thoughts and comments on it, and we were wondering whether
> it could be a potential GSoC project as well.

I'm sorry, but it doesn't make much sense to run Cassandra on top of a
B-tree. Reorganizing indexes when writing goes against one of
Cassandra's primary design goals: streaming writes to disk as
efficiently as possible.

http://wiki.apache.org/cassandra/FAQ#reads_slower_writes
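
To make the point about streaming writes concrete, here is a toy sketch in
Python of a log-structured write path. It is not Cassandra's code; the class
name, method names, and flush threshold are all made up for illustration.
Writes land in an in-memory table and are flushed as immutable sorted runs,
so nothing already on "disk" is reorganized, which is exactly what a B-tree
would have to do on insert.

```python
import bisect


class LogStructuredStore:
    """Hypothetical toy store: buffer writes in memory, flush immutable sorted runs."""

    def __init__(self, flush_threshold=3):
        self.memtable = {}      # recent writes, held in memory
        self.segments = []      # flushed runs of (key, value), oldest first
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        # A write only touches the in-memory table; no existing run is modified.
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the whole table out as one new sorted run (a sequential write).
        if self.memtable:
            self.segments.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Reads check the memtable first, then every run from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for segment in reversed(self.segments):
            i = bisect.bisect_left(segment, (key,))
            if i < len(segment) and segment[i][0] == key:
                return segment[i][1]
        return None
```

The trade-off shows up in get(): a read may have to look through several
runs, while a write never pays for index maintenance. That is the same
trade-off the FAQ entry above describes.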

Additionally, there are *so many* other systems that already use
B-trees. Why add one to Cassandra?

You may want to look at Project Voldemort, which can already distribute
data across servers similarly to Cassandra but (optionally) with
B-tree-based storage on each box. MongoDB also supports sharded data
with B-tree-based indexes. Finally, HBase keeps rows sorted by key and
splits them into ranges across servers, much like a distributed B-tree.

-- 
David Strauss
   | david@fourkitchens.com
Four Kitchens
   | http://fourkitchens.com
   | +1 512 454 6659 [office]
   | +1 512 870 8453 [direct]

