incubator-cassandra-user mailing list archives

From Arun Cherian
Subject Cassandra for Ad-hoc Aggregation and formula calculation
Date Fri, 10 Dec 2010 21:42:53 GMT

I have been reading up on Cassandra for the past few weeks and I am
highly impressed by the features it offers. At work, we are starting
on a product that will handle several million CDRs (Call Data
Records, which can basically be thought of as .CSV files) per day. We
will have to store the data and perform aggregations and calculations
on it. A few veteran RDBMS admin friends (we are a small .NET shop and
don't have any in-house DB talent) recommended Infobright and NoSQL to
us, hence my search. I was wondering if Cassandra is a good fit for
the following:

1. Storing several million data records per day (each record will be a
few KB in size) without any data loss.
2. Aggregation of certain fields in the stored records, such as the
average over a time period.
3. Using certain existing fields to calculate new values on the fly
and storing those values too.
4. We were wondering whether pre-aggregation (calculating aggregates
per 1 min, 5 min, 15 min, etc. ahead of time) is a good choice. If we
also need ad-hoc aggregation, does Cassandra support that over this
amount of data?
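
To make the pre-aggregation idea in item 4 concrete, here is a minimal sketch in plain Python. It is not Cassandra-specific and makes several assumptions for illustration only: each record is reduced to a (Unix timestamp, numeric value) pair, and the names BUCKET_WIDTHS, bucket_start, pre_aggregate and average are hypothetical. Each bucket keeps a count and a sum rather than an average, so averages over longer periods can be recombined from the stored buckets.

```python
from collections import defaultdict

# Hypothetical bucket widths in seconds: 1 min, 5 min, 15 min.
BUCKET_WIDTHS = (60, 300, 900)


def bucket_start(timestamp, width):
    """Round a Unix timestamp down to the start of its bucket."""
    return timestamp - (timestamp % width)


def pre_aggregate(records, widths=BUCKET_WIDTHS):
    """Build {width: {bucket_start: [count, total]}} from (timestamp, value) pairs."""
    totals = {w: defaultdict(lambda: [0, 0.0]) for w in widths}
    for ts, value in records:
        for w in widths:
            cell = totals[w][bucket_start(ts, w)]
            cell[0] += 1       # number of records in this bucket
            cell[1] += value   # running sum of the field being aggregated
    return totals


def average(totals, width, start, end):
    """Average over buckets of one width whose start falls in [start, end)."""
    count = 0
    total = 0.0
    for b, (n, s) in totals[width].items():
        if start <= b < end:
            count += n
            total += s
    return total / count if count else None


if __name__ == "__main__":
    # Made-up sample data: (timestamp in seconds, field value).
    sample = [(0, 10.0), (30, 20.0), (61, 30.0), (400, 40.0)]
    aggregated = pre_aggregate(sample)
    print(average(aggregated, 60, 0, 120))
```

An ad-hoc query whose time range does not line up with any stored bucket width would still need the raw records, which is the case the question in item 4 is asking about.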

