As you haven't specified all the details about your filters and your data layout (structure), all I can suggest at a very high level is that you create a separate CF (column family) for each filter.
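To make the idea concrete, here is a rough sketch in plain Python. It is only an in-memory stand-in for the data layout, not Cassandra client code, and every name in it (FilterIndex, add_entry, the two example CFs, the row-key shape) is made up for illustration. The assumption is that each filter gets its own CF, the row key identifies the feed plus the filter values, and the columns are kept sorted by timestamp so a page is just a slice of the row.

```python
import bisect
from collections import defaultdict


class FilterIndex:
    """In-memory stand-in for one column family (hypothetical name).

    Maps a row key to a list of (timestamp, entry_id) columns kept in
    ascending order, mimicking columns sorted by name within a row.
    """

    def __init__(self):
        self.rows = defaultdict(list)

    def insert(self, row_key, timestamp, entry_id):
        # Keep the row's columns sorted as they are written.
        bisect.insort(self.rows[row_key], (timestamp, entry_id))

    def page(self, row_key, count, before=None):
        """Return up to `count` columns, newest first, strictly older than `before`."""
        cols = self.rows.get(row_key, [])
        end = len(cols) if before is None else bisect.bisect_left(cols, before)
        start = max(0, end - count)
        return cols[start:end][::-1]


# One index per filter (both names are assumptions for this sketch).
all_entries = FilterIndex()        # every entry in a feed
by_type_and_user = FilterIndex()   # e.g. "only likes from a given user"


def add_entry(feed_id, timestamp, entry_id, entry_type, user):
    # Each write is fanned out to every filter's CF.
    all_entries.insert(feed_id, timestamp, entry_id)
    by_type_and_user.insert((feed_id, entry_type, user), timestamp, entry_id)
```

To paginate, fetch the first page with no `before` value, then pass the last column of that page as `before` for the next one. The trade-off is that every entry is written once per filter, so storage and write volume grow with the number of filters you support.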


On Sat, May 1, 2010 at 5:04 PM, Rakesh Rajan <rakeshxp@gmail.com> wrote:
I am evaluating Cassandra to implement activity streams. We currently have over 1000000 feeds with total entries exceeding 320000000, implemented using Redis (~320 entries / feed). We would like to hear from the community on how to use Cassandra to solve the following cases:
  1. Ability to fetch entries by applying a few filters (like "show me only likes from a given user"). This would include range queries to support pagination, so it would mean indices on a few columns such as the feed id, feed type, etc.
  2. We have around 3 machines with 4GB RAM each for this purpose and are thinking of using a replication factor of 2. Would 4GB * 3 be enough for Cassandra for this kind of data? I read that Cassandra does not keep all the data in memory, but I want to be sure that we have the right server config to handle this data using Cassandra.
Thanks,
Rakesh