incubator-cassandra-user mailing list archives

From Manoj Khangaonkar
Subject Cassandra & MapReduce/Storm/ etc
Date Thu, 08 May 2014 21:43:38 GMT

Searching for Cassandra with MapReduce, I am finding that the search
results are really dated -- from version 0.7 & 2010/2011.

Is there a good blog/article that describes how to use MapReduce on a
Cassandra table?

From my naive understanding, Cassandra is all about partitioning. Querying
is based on the partition key + clustering column(s).

The input to MapReduce is a sequence of key/value pairs. For Storm it is a
stream of tuples.

If a database table is the input source for MapReduce or Storm, then in the
simple case this translates to a full table scan of the input table, which
can time out and is generally not a recommended access pattern in Cassandra.
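
To make the full-scan concern concrete: one way such a scan could be broken
into smaller, independently retryable reads is to slice the token ring and
issue one bounded query per slice. The following is only a minimal sketch,
assuming the default Murmur3Partitioner (tokens from -2^63 to 2^63 - 1). The
keyspace, table, and column names are hypothetical, and the code only builds
CQL strings; it does not talk to a cluster. Every row is still read, so this
bounds the size of each query rather than the total cost.

```python
# Sketch: split a full table scan into token-range queries.
# Assumption: the cluster uses Murmur3Partitioner, whose tokens are
# signed 64-bit integers in [-2**63, 2**63 - 1].

MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def split_token_ring(num_splits):
    """Return contiguous (start, end) token ranges covering the ring."""
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    total = MAX_TOKEN - MIN_TOKEN
    # Integer arithmetic keeps the boundaries exact for 64-bit tokens.
    bounds = [MIN_TOKEN + (total * i) // num_splits
              for i in range(num_splits + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def split_queries(keyspace, table, partition_key, num_splits):
    """Build one CQL SELECT per token range (names are hypothetical)."""
    queries = []
    for i, (start, end) in enumerate(split_token_ring(num_splits)):
        # The first slice includes its lower bound; later slices exclude
        # it so that adjacent slices do not overlap.
        lower_op = ">=" if i == 0 else ">"
        queries.append(
            f"SELECT * FROM {keyspace}.{table} "
            f"WHERE token({partition_key}) {lower_op} {start} "
            f"AND token({partition_key}) <= {end}"
        )
    return queries


if __name__ == "__main__":
    # Hypothetical keyspace "demo", table "events", partition key "id".
    for query in split_queries("demo", "events", "id", 4):
        print(query)
```

Each resulting query could then be handed to a separate map task or spout,
so a timeout affects one slice instead of the whole scan.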

My initial reaction is that if I need to process data with MapReduce or
Storm, reading it from Cassandra might not be the optimal way. Storing the
output in Cassandra, however, does make sense.

If anyone has links to blogs or personal experience in this area, I would
appreciate it if you could share them.

