cassandra-user mailing list archives

From Or Yanay
Subject Map-Reduce on top of cassandra
Date Mon, 14 Mar 2011 15:06:05 GMT
Hi All,

I am trying to write some map-reduce jobs to answer questions such as: how many records
have status X?
I am using Cassandra 0.7.0 and have 5 nodes with ~100 GB of data on each node.

I wrote the code based on the word_count example. The map-reduce job runs successfully
but is extremely slow (about 2 hours for the simplest key count).

I am now looking to track down the slowness and tune my process, or explore alternative ways
to achieve the same goal.

Can anyone point me to a way to tune my map-reduce job?
Does anyone have experience exploring Cassandra data with a Hadoop cluster configuration?
( As suggested in

