Arthur,

Yes, my use case for this Cassandra cluster is analytics. I am building a system similar to Google Dapper (application tracing). I collect application traces and write them to Cassandra. Then periodic rollup tasks read the data, summarize it, and write the results back.

Any thoughts on how to manage a write-heavy cluster?

Thanks,
Carl


On Thu, Aug 1, 2013 at 11:28 AM, Arthur Zubarev <Arthur.Zubarev@aol.com> wrote:
Hi Carl,
 
The ‘repair’ is for data reads. Compaction will take care of the expired data.
 
The fact that a repair runs long makes me think instead that the nodes are receiving unbalanced amounts of writes.
 
Regards,
 
Arthur
 
Sent: Thursday, August 01, 2013 12:35 PM
Subject: How often to run `nodetool repair`
 
Hello,
 
I read in the docs that `nodetool repair` should be run regularly unless no deletes are ever performed. In my app, I never delete, but I make heavy use of the TTL feature. Should repair still be run regularly? Also, does repair take less time if it is run regularly? If not, is there a way to run it incrementally? When I do run repair, it takes a long time and causes high CPU usage and iowait.
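
On the question of running repair incrementally, one possible approach is to repair only a slice of the token ring at a time, or to use `nodetool repair -pr` on each node in turn so that only its primary range is repaired. The following is a minimal sketch, not a tested procedure: it assumes the cluster uses Murmur3Partitioner (tokens from -2^63 to 2^63-1), that the installed Cassandra version supports the `-st` and `-et` options of `nodetool repair`, and that the keyspace is called `traces`, which is a hypothetical name. It only prints the commands; it does not run them.

```python
# Sketch: split the full token ring into equal subranges and build one
# `nodetool repair` command per subrange, so each run covers less data.
# Assumptions: Murmur3Partitioner token range; "traces" is a hypothetical
# keyspace name; -st/-et must be supported by your Cassandra version.

MIN_TOKEN = -(2 ** 63)       # lowest Murmur3Partitioner token
MAX_TOKEN = 2 ** 63 - 1      # highest Murmur3Partitioner token


def split_ring(pieces):
    """Return `pieces` contiguous (start, end) token pairs covering the ring."""
    span = MAX_TOKEN - MIN_TOKEN
    bounds = [MIN_TOKEN + span * i // pieces for i in range(pieces + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def repair_commands(keyspace, pieces):
    """Build one subrange repair command string per piece."""
    return [
        "nodetool repair -st {} -et {} {}".format(start, end, keyspace)
        for start, end in split_ring(pieces)
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed and run one at a time.
    for command in repair_commands("traces", 4):
        print(command)
```

Running the commands one at a time, for example spread across off-peak hours, should keep each burst of CPU and iowait shorter, though the total work is not reduced. A cluster on RandomPartitioner would need a different token range (0 to 2^127).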
 
Thoughts?
 
Thanks,
Carl