I'm currently writing collected data continuously to Cassandra, using keys that start with a timestamp followed by a unique identifier (like 2009.01.01.00.00.00.RANDOM) so that I can query by time range.
I'm thinking of running periodic MapReduce jobs that go through a designated time period. For example, I might want to analyze only the data between 2009.01 and 2009.02.
I had done this previously with HBase; however, I thought Cassandra would be a better choice for continuously storing data in a safe manner.
I guess this briefly explains my designated use case.
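
To make the key scheme concrete, here is a minimal sketch in plain Python of how such timestamp-prefixed keys sort and how a time window maps to a pair of key bounds. It does not call any Cassandra or Hadoop API; the names make_key, key_range and in_range are hypothetical helpers made up for illustration, and the random suffix is assumed to be a UUID. It also assumes the keys are compared as plain strings, which on the Cassandra side would require an order-preserving partitioner.

```python
from datetime import datetime
import uuid

# Timestamp layout taken from the example key 2009.01.01.00.00.00.RANDOM.
KEY_FORMAT = "%Y.%m.%d.%H.%M.%S"


def make_key(ts, suffix=None):
    """Build a row key: zero-padded timestamp, then a unique suffix.

    The UUID suffix is an assumption; any unique identifier would do.
    """
    suffix = suffix or uuid.uuid4().hex
    return f"{ts.strftime(KEY_FORMAT)}.{suffix}"


def key_range(start, end):
    """Return (start_key, end_key) bounding all keys with start <= ts < end."""
    return start.strftime(KEY_FORMAT), end.strftime(KEY_FORMAT)


def in_range(key, start_key, end_key):
    """Lexicographic check; valid because every timestamp field is zero-padded."""
    return start_key <= key < end_key


if __name__ == "__main__":
    # Window covering January 2009 (end bound is exclusive).
    start_key, end_key = key_range(datetime(2009, 1, 1), datetime(2009, 2, 1))
    sample = make_key(datetime(2009, 1, 15, 12, 30, 0))
    print(start_key, end_key, in_range(sample, start_key, end_key))
```

Because the fields are fixed-width and ordered from most to least significant, string order matches chronological order, so a job limited to one month only needs the two bound keys rather than a filter over every row.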
It's technically possible, but no, 0.6 does not support this.
What is the use case you are thinking of?
On Thu, Apr 29, 2010 at 11:14 AM, Utku Can Topçu <email@example.com> wrote:
> I've been trying to use Cassandra as a kind of supplementary input
> source for Hadoop MapReduce jobs.
> By default, ColumnFamilyInputFormat does a full column family scan
> to provide the map input for the MapReduce framework.
> However, I believe it should be possible to give it a key range so that
> only the specified range is scanned.
> Is this possible by any means?
> Best Regards,
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support