We run it concurrently on every RF-th node (if RF = 3, we run it in 3 waves). If a node is busy cleaning up, the client will time out and ask another node that holds a copy of the data and is not being cleaned up.
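A minimal sketch of the wave idea in Python, under some assumptions that are not from this thread: the host list is given in ring order, each node owns a single token, and replicas of a range sit on consecutive nodes. The helper names (cleanup_waves, run_cleanup_in_waves) are hypothetical. If the node count is not a multiple of RF, the last node and the first node can fall into the same wave even though they are ring neighbours, so treat this as an illustration only.

```python
import subprocess


def cleanup_waves(ring_ordered_hosts, rf):
    """Split hosts into rf waves; wave k holds every rf-th host starting at index k."""
    if rf < 1:
        raise ValueError("rf must be at least 1")
    return [ring_ordered_hosts[k::rf] for k in range(rf)]


def run_cleanup_in_waves(ring_ordered_hosts, rf, start=subprocess.Popen):
    """Run 'nodetool cleanup' wave by wave (hypothetical helper)."""
    for wave in cleanup_waves(ring_ordered_hosts, rf):
        # Start cleanup on every host of the wave at once...
        procs = [start(["nodetool", "-h", host, "cleanup"]) for host in wave]
        # ...and wait for the whole wave to finish before starting the next one.
        for proc in procs:
            proc.wait()
```

With six hosts and RF = 3, cleanup_waves(["n1", "n2", "n3", "n4", "n5", "n6"], 3) gives three waves of two hosts each, so under the assumptions above no two adjacent replicas are cleaned at the same time.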

"Will node tool cleanup consume lot of IO and CPU even though there is nothing to clean"

Yes, I think so, since the node still has to check that there is nothing to clean. I don't think there is any case that needs a regular cleanup anyway.


2013/6/12 Michal Michalski <michalm@opera.com>
What will happen if I schedule nodetool cleanup to run periodically (similar to nodetool repair)? Will nodetool cleanup consume a lot of IO and CPU even though there is nothing to clean?

Why would you need to do so?

M.



Thank you
Emalayan


________________________________
From: Robert Coli <rcoli@eventbrite.com>
To: user@cassandra.apache.org; Emalayan Vairavanathan <svemalayan@yahoo.com>
Sent: Monday, 10 June 2013 5:15 PM
Subject: Re: [Cassandra] Expanding a Cassandra cluster


On Mon, Jun 10, 2013 at 3:13 PM, Emalayan Vairavanathan
<svemalayan@yahoo.com> wrote:
I suspect that nodetool cleanup is IO intensive. So running nodetool cleanup
concurrently on the entire cluster may significantly impact the IO
performance of applications.

cleanup is a specific kind of compaction, and as such respects the
compaction throughput throttle.

The compaction throughput throttle is designed to prevent compaction
from negatively impacting the performance of things-not-compaction. If
you notice that cleanup compaction on all or most nodes consumes too
much i/o, reduce the throttle value.
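As a rough illustration of reducing the throttle, the sketch below builds and runs "nodetool setcompactionthroughput" for a list of hosts. The helper names, host names and the value used are hypothetical. The value is in MB/s and 0 disables throttling; the persistent setting in cassandra.yaml for Cassandra versions of this era is compaction_throughput_mb_per_sec.

```python
import subprocess


def throttle_command(host, mb_per_sec):
    """Build the nodetool call that sets the compaction throughput cap on one host."""
    if mb_per_sec < 0:
        raise ValueError("throughput must be >= 0 (0 disables throttling)")
    return ["nodetool", "-h", host, "setcompactionthroughput", str(mb_per_sec)]


def lower_throttle(hosts, mb_per_sec, run=subprocess.check_call):
    """Apply the same cap to every host (hypothetical helper)."""
    for host in hosts:
        # The change takes effect at runtime and is not persisted across restarts.
        run(throttle_command(host, mb_per_sec))
```

For example, lower_throttle(["node1", "node2"], 8) would cap compaction, and therefore cleanup, at 8 MB/s on both hosts until the next restart.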

=Rob