cassandra-user mailing list archives

From Drew Dahlke <>
Subject Re: Map/Reduce over Cassandra
Date Wed, 18 Aug 2010 13:03:47 GMT
Hey Bill,

A few months ago we did an experiment with 5 Hadoop nodes pulling from
4 Cassandra nodes. It was pulling down 1 column family with 8 small
columns and just dumping the raw data to HDFS. It was cycling through
around 17K map tasks per sec. The machines weren't being taxed too hard,
so I'm sure there's some concurrency tuning we could have done to speed
that up. Unfortunately we don't have that same data on HDFS yet, so I
can't really give a direct comparison.
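
For a rough sense of scale, the figures above can be broken down per node. This is only a back-of-the-envelope sketch: it assumes "17K" means 17,000 and that the load was spread evenly across nodes, neither of which was measured in the experiment. The names below are illustrative.

```python
# Back-of-the-envelope breakdown of the figures quoted in the message.
# Assumptions (not from the experiment itself): "17K" is read as 17,000,
# and work is spread evenly across the nodes on each side.
TOTAL_PER_SEC = 17_000
HADOOP_NODES = 5
CASSANDRA_NODES = 4


def per_node(total_per_sec: float, nodes: int) -> float:
    """Return the average per-node rate for an evenly spread load."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return total_per_sec / nodes


per_hadoop_node = per_node(TOTAL_PER_SEC, HADOOP_NODES)
per_cassandra_node = per_node(TOTAL_PER_SEC, CASSANDRA_NODES)

print(f"per Hadoop node:    {per_hadoop_node:.0f}/sec")
print(f"per Cassandra node: {per_cassandra_node:.0f}/sec")
```

Under those assumptions that works out to about 3,400 per second per Hadoop node and 4,250 per second served by each Cassandra node.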

Hope that helps. I'm curious what others have seen as well.

On Tue, Aug 17, 2010 at 6:59 PM, Bill Hastings <> wrote:
> Hi All
> How performant is M/R on Cassandra when compared to running it on HDFS?
> Anyone have any numbers they can share? Specifically, how much data the
> M/R job was run against, what the throughput was, etc. Any information
> would be very helpful.
> --
> Cheers
> Bill
