incubator-cassandra-user mailing list archives

From Matteo Caprari
Subject Cassandra and hadoop?
Date Tue, 16 Mar 2010 10:16:54 GMT

I've tried the mapreduce example in 0.6 contrib/wordcount and it
worked very well.

I have a shallow understanding of both worlds, so pardon my questions:

Is the integration with Hadoop just 'semantic' (i.e. the map/reduce API is
only used as a query abstraction), or is it 'structural' (i.e. Cassandra
can 'talk to Hadoop' and replace HDFS as the input source)?

In practice:
- If I want to run a distributed MapReduce job on Cassandra, does my
Cassandra cluster have to be a Hadoop cluster as well?
- Do I get data locality optimization? I reckon Cassandra can in
principle figure out where it is best to execute a
but to do so it should take over some of the responsibilities of
Hadoop's JobTracker. Does it?

:Matteo Caprari
