cassandra-user mailing list archives

From gabriele renzi <rff....@gmail.com>
Subject Re: Anyone using hadoop/MapReduce integration currently?
Date Thu, 27 May 2010 06:44:27 GMT
On Tue, May 25, 2010 at 6:35 PM, Jeremy Hanna
<jeremy.hanna1234@gmail.com> wrote:


> What is the use case?

We end up with messed-up data in the database, so from time to time we
run a mapreduce job to find the irregular data.


> Why are you using Cassandra versus using data stored in HDFS or HBase?

As of now our mapreduce task is only used for "fixing" Cassandra, so
the question doesn't really apply :)


> Are you using a separate Hadoop cluster to run the MR jobs on, or perhaps are you running
> the Job Tracker and Task Trackers on Cassandra nodes?

A separate Hadoop cluster.

> Is there anything holding you back from using it (if you would like to use it but currently
> cannot)?

It would be nice if the output of the mapreduce job could go to a
MutationOutputFormat to which we could write inserts/deletes. I recall
there is already something on JIRA for this, although I'm not sure
whether it was merged.
