incubator-cassandra-user mailing list archives

From Aaron Morton <aa...@thelastpickle.com>
Subject Re: Nodetool cleanup
Date Fri, 29 Nov 2013 01:59:42 GMT
> I hope I get this right :) 
Thanks for contributing :)

> a repair will trigger a major compaction on your node which will take up a lot of CPU
> and IO performance. It needs to do this to build up the data structure that is used for the
> repair. After the compaction this is streamed to the different nodes in order to repair them.

It does not trigger a major compaction; that is what we call running compaction from the
command line, which compacts all SSTables into one big one.

It will flush all the data to disk, which will create some additional compaction.

The major concern is that it's a disk IO intensive operation: it reads all the data and writes
it to new SSTables (a one-to-one mapping). If you have all nodes doing this at the same
time there may be some degraded performance. And because every node is busy, the Dynamic
Snitch cannot route requests away from overloaded nodes.

Cleanup is less intensive than repair, but it’s still a good idea to stagger it. If you
need to run it on all machines at once (or you have very powerful machines) it’s probably
going to be OK.
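
A minimal sketch of staggering cleanup, assuming `nodetool` is on the PATH, that `nodetool -h <host> cleanup` blocks until that node has finished, and that the host list is supplied by you (the function names and the addresses in the usage note are hypothetical):

```python
import subprocess
from typing import Callable, Iterable, List


def cleanup_command(host: str) -> List[str]:
    """Build the nodetool invocation that runs cleanup against one node."""
    return ["nodetool", "-h", host, "cleanup"]


def staggered_cleanup(
    hosts: Iterable[str],
    run: Callable[..., object] = subprocess.run,
) -> List[str]:
    """Run cleanup on one node at a time, in the order given.

    Assumption: `run` blocks until the command exits, so the next node
    only starts once the previous one is done. With check=True,
    subprocess.run raises on a non-zero exit, which stops the loop
    instead of moving on to the remaining nodes.
    """
    finished: List[str] = []
    for host in hosts:
        run(cleanup_command(host), check=True)
        finished.append(host)
    return finished
```

Calling `staggered_cleanup(["10.0.0.1", "10.0.0.2", "10.0.0.3"])` would then work through the nodes one after another, so only one node at a time carries the extra disk IO.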
 
Hope that helps. 

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 26/11/2013, at 5:14 am, Artur Kronenberg <artur.kronenberg@openmarket.com> wrote:

> Hi Julien,
> 
> I hope I get this right :) 
> 
> a repair will trigger a major compaction on your node which will take up a lot of CPU
> and IO performance. It needs to do this to build up the data structure that is used for the
> repair. After the compaction this is streamed to the different nodes in order to repair them.

> 
> If you trigger this on every node simultaneously you basically take the performance away
> from your cluster. I would expect Cassandra still to function, just way slower than before.
> Triggering it node after node will leave your cluster with more resources to handle incoming
> requests.
> 
> 
> Cheers,
> 
> Artur 
> On 25/11/13 15:12, Julien Campan wrote:
>> Hi,
>> 
>> I'm working with Cassandra 1.2.2 and I have a question about nodetool cleanup. 
>> In the documentation, it is written: "Wait for cleanup to complete on one node before
>> doing the next"
>> 
>> I would like to know why we can't run cleanup on many nodes at the same time?
>> 
>> 
>> Thanks
>> 
>> 
> 

