If you want to run arbitrarily complex online / realtime queries, look at DataStax Enterprise, https://github.com/tjake/Solandra, or plain Solr.

Alternatively, denormalise the model to materialise the results when you insert, so your query is a straight lookup. Or do some client-side filtering / aggregation.
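
As a rough sketch of the denormalise-and-filter idea (not tied to any Cassandra client API): the Python below uses an in-memory dict as a stand-in for a lookup column family keyed on the mandatory fields (Firstname, Lastname, Age), and applies the optional criteria (Country, ChildCount) on the client. The names users_by_name_age, insert_user and find_users are made up for illustration, and in a real deployment the dict would be replaced by reads and writes against Cassandra.

```python
from collections import defaultdict

# Hypothetical stand-in for a denormalised lookup column family:
# row key = (firstname, lastname, age), value = the matching user records.
users_by_name_age = defaultdict(list)


def insert_user(user):
    """Materialise the lookup entry at insert time."""
    key = (user["firstname"], user["lastname"], user["age"])
    users_by_name_age[key].append(user)


def find_users(firstname, lastname, age, **optional):
    """Straight lookup on the mandatory fields, then client-side filtering."""
    candidates = users_by_name_age.get((firstname, lastname, age), [])
    return [
        user
        for user in candidates
        if all(user.get(field) == value for field, value in optional.items())
    ]


insert_user({"id": 1, "firstname": "Ann", "lastname": "Lee", "age": 30,
             "country": 7, "child_count": 2})
insert_user({"id": 2, "firstname": "Ann", "lastname": "Lee", "age": 30,
             "country": 9, "child_count": 0})

# Mandatory fields only, then mandatory fields plus one optional filter.
all_matches = find_users("Ann", "Lee", 30)
in_country_7 = find_users("Ann", "Lee", 30, country=7)
```

This only works well when the mandatory fields narrow the candidates down to a small set, since everything under that key is fetched before filtering.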

If you want to run the queries offline, you can use Pig or Hive with Hadoop over Cassandra. The Apache Cassandra distribution includes Pig support, Hive support is coming (I think), and there are Hadoop interfaces. You can also look at DataStax Enterprise.


Aaron Morton
Freelance Developer

On 31/05/2012, at 11:07 PM, Nury Redjepow wrote:

We want to use Cassandra to store complex data, but we can't figure out how to organize the indexes.

Our table (column family) looks like this:

Users = { RandomId int, Firstname varchar, Lastname varchar, Age int, Country int, ChildCount int }

In our queries we have mandatory fields (Firstname, Lastname, Age) and extra search options (Country, ChildCount). How do we organize the indexes to make these kinds of queries fast?

At first I thought it would be natural to make a composite index on (Firstname, Lastname, Age) and add separate secondary indexes on the remaining fields (Country and ChildCount). But I can't insert rows into the table after creating the secondary indexes, and I can't query the table either.

I'm using Cassandra 1.1.0, and cqlsh with the --cql3 option.

Any other suggestions to solve our problem (complex queries with mandatory and additional options) are welcome.

The main point is how we can join data in Cassandra. If I make a few index column families, do I need to intersect the values to get the rows that pass all search criteria? Or should I use something based on Hadoop (Pig, Hive) to make such queries?

Respectfully, Nury