cassandra-user mailing list archives

From gabriele renzi
Subject Re: Anyone using hadoop/MapReduce integration currently?
Date Thu, 27 May 2010 06:44:27 GMT
On Tue, May 25, 2010 at 6:35 PM, Jeremy Hanna wrote:

> What is the use case?

we end up with messed up data in the database, so we run a mapreduce job
from time to time to find the irregular data.

> Why are you using Cassandra versus using data stored in HDFS or HBase?

as of now our mapreduce task is only used for "fixing" cassandra, so
the question doesn't really apply :)

> Are you using a separate Hadoop cluster to run the MR jobs on, or perhaps are you running
> the Job Tracker and Task Trackers on Cassandra nodes?


> Is there anything holding you back from using it (if you would like to use it but currently

It would be nice if the output of the mapreduce job was a
MutationOutputFormat in which we could write insert/delete, but I
recall there is something on jira already albeit not sure if it was
