On Tue, May 4, 2010 at 4:55 PM, David Rosenstrauch <darose@darose.net> wrote:
> I've had some neat ideas that I'd like to tinker with for a distributed DB
> that implements a very different data model than Cassandra. However, I
> obviously don't want to reinvent the wheel - particularly because in the
> case of distributed systems, the wheel is quite complicated and hard to get
> right.
>
> What I'm thinking would make more sense then is to build on top of the
> Cassandra core (since it's obviously been implemented well and has been
> proven to scale quite nicely) and then implement my own middle/top layer(s).
>
> So I'm wondering:
>
> * Anyone know if such a thing has been attempted before? (And, if so, links
> to any stories about success / failure / tips.)
I believe Jun Rao and Sandeep Tata built a kind of chain replication
starting from Cassandra 0.4-ish. I don't think the code is available.
> * Would there happen to be any docs/blogs/emails providing useful tech info
> for such an effort?
I don't know of any, short of the articles about Cassandra's code
itself. Ran Tavory wrote an excellent survey piece:
http://prettyprint.me/2010/05/02/understanding-cassandra-code-base/
> * What I should include/exclude from the Cassandra source code to start
> building on? Or, in other words, which package(s) from the source would be
> considered to constitute the core layer?
I don't see any shortcuts here. You need to understand the code
well enough to answer that question yourself.
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com