You can try using a different column family (CF) for each result set, or build an inverted index, but given the number of inserts you have, that will get complicated. The first thing you need to do is stop thinking in RDBMS terms, because Cassandra is not at all like one.
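To make the inverted-index idea concrete, here is a minimal sketch in Python. It makes no Cassandra client calls: plain dictionaries stand in for two column families, and the names `mail_metadata`, `subject_index`, `insert_mail` and `find_locations` are hypothetical, chosen only for illustration. The assumed layout is one index row per sender and subject word, with one column per mail.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for two column families (assumption:
# a real setup would write these through a Cassandra client instead).
# mail_metadata: row key = mail_id           -> metadata columns
# subject_index: row key = "<sender>:<word>" -> {mail_id: location}
mail_metadata = {}
subject_index = defaultdict(dict)


def tokenize(subject):
    """Lower-case the subject and split it into distinct words."""
    return set(subject.lower().split())


def insert_mail(mail_id, sender_id, recipient_id, timestamp, subject, location):
    """Store the metadata row plus one index column per subject word."""
    mail_metadata[mail_id] = {
        "sender_id": sender_id,
        "recipient_id": recipient_id,
        "timestamp": timestamp,
        "subject": subject,
        "location": location,
    }
    for token in tokenize(subject):
        subject_index[f"{sender_id}:{token}"][mail_id] = location


def find_locations(sender_id, word):
    """Read a single index row instead of scanning all of the sender's mail."""
    row = subject_index.get(f"{sender_id}:{word.lower()}", {})
    return sorted(row.values())
```

Note the trade-off: every mail costs one extra write per subject word, and this sketch matches whole words only, not arbitrary substrings the way SQL `LIKE '%...%'` would.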
Cassandra doesn't support ad hoc queries like the one you're describing.
I recommend looking at Lucandra.
On 9/2/2010 12:27 PM, Anuj Kabra wrote:
I am working with cassandra-0.6.4 on a mail retrieval problem. We have the metadata of each mail (sender, recipient, timestamp, subject, and the location of the mail file) stored in a Cassandra DB. Every day about 25,000 records will be entered into this DB. We have not finalised the data model yet, but we are starting with a simple one that has only one column family:
<ColumnFamily name="MailMetadata" CompareWith="UTF8Type">
It has the user_id of the recipient as the key, and columns for sender_id, timestamp of the mail, subject, and location of the mail file.
Now our use case is to get the locations of all mail files sent by a user that match a given subject (which can be a part of the original subject of the mail). As far as I know, we can get all the rows of a user by using user_id as the key. After that I need to iterate over all the rows I get and check which mails fit the given condition (matching a subject in this case), which is computationally very heavy, as we would get thousands of rows.
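For reference, the client-side filtering described above amounts to something like the following Python sketch. It assumes the rows have already been fetched and are represented as plain dictionaries; the function name `scan_for_subject` and the dictionary keys are hypothetical and only mirror the columns listed earlier.

```python
def scan_for_subject(rows, sender_id, fragment):
    """Return file locations of mails from sender_id whose subject contains fragment.

    Every fetched row is inspected, so the cost grows linearly with the
    number of mails stored under the key.
    """
    needle = fragment.lower()
    return [
        row["location"]
        for row in rows
        if row["sender_id"] == sender_id and needle in row["subject"].lower()
    ]
```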
So we are looking for something like MySQL's "LIKE", provided through Thrift. I also need to know if I am going the right way.
Help is much appreciated.