cassandra-user mailing list archives

From Peter Tillotson <>
Subject Re: Second Cassandra users survey
Date Thu, 03 Nov 2011 12:46:33 GMT
I'm using Cassandra as a big graph database, loading large volumes of data live and linking
on the fly.
The number of edges grows geometrically with the data added, and the edges need to be read to
continue linking the graph on the fly.

Consequently, my problem is constrained by:
 * Predominantly reads - especially when the data gets large and reads are quasi-random
 * I have lots of data to plow in, to be read
 * Although the problem could scale out and possibly all be held in RAM, it requires too much
kit for that to be viable

So, my findings with Cassandra are:
 * Compaction is expensive. I need it, but
   1) It takes disk IO away from my reads
   2) It destroys the file cache
   I've not had the chance to do extensive tests with the LevelDB-style compaction
 * Compaction has historically been too hard to configure
 * It is memory hungry

So for me the biggest features would be
 * Cheaper compaction
 * Lower memory usage
 * Indexing dynamic colnames (e.g. Lucene TermEnum against rowkey:colkey)
   I do a lot of checking against dynamic colnames
The great features are that redundancy and live addition of shards are available out of the
box.

I've also experimented with GoldenOrb and triggered updates. I think there is a fair bit
that can be achieved in my problem with local data access. Through GoldenOrb and Hadoop writables
I managed to get both a BigTable and a Pregel access model onto my Cassandra data. It was schema
specific, but provided a local compute model.


From: Jonathan Ellis <>
To: user <>
Sent: Tuesday, 1 November 2011, 22:59
Subject: Second Cassandra users survey

Hi all,

Two years ago I asked for Cassandra use cases and feature requests.
[1]  The results [2] have been extremely useful in setting and
prioritizing goals for Cassandra development.  But with the release of
1.0 we've accomplished basically everything from our original wish
list. [3]

I'd love to hear from modern Cassandra users again, especially if
you're usually a quiet lurker.  What does Cassandra do well?  What are
your pain points?  What's your feature wish list?

As before, if you're in stealth mode or don't want to say anything in
public, feel free to reply to me privately and I will keep it off the
record.


Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support