On Tuesday, October 2, 2012 at 17:06, Jeremy Hanna wrote:
Another option that may or may not work for you is the support in Cassandra 1.1+ to use a secondary index as an input to your mapreduce job. What you might do is add a field to the column family that represents which virtual column family that it is part of. Then when doing mapreduce jobs, you could use that field as the secondary index limiter. Secondary index mapreduce is not as efficient since you first get all of the keys and then do multigets to get the data that you need for the mapreduce job. However, it's another option for not scanning the whole column family.