cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Need to maintenance on a cassandra node, are there problems with this process
Date Fri, 19 Aug 2011 04:42:12 GMT
You should get on 0.7.8 while you are doing this; this is a pretty good reason: https://github.com/apache/cassandra/blob/cassandra-0.7.8/CHANGES.txt#L58

>  Never done a read repair on this cluster before, is that a problem?
Potentially. 
Repair will ensure that your data is distributed, and that deletes don't mysteriously come
back to life: http://wiki.apache.org/cassandra/Operations#Dealing_with_the_consequences_of_nodetool_repair_not_running_within_GCGraceSeconds
 
Personally I would get a repair to complete before I started this process. 

You may want to make sure everything is compacted as best it can be beforehand; see some
of the other threads about repair using a lot of space. 

* use nodetool to change the compaction threshold down to 2 for the CFs
* trigger a minor compaction using nodetool flush
* wait and monitor using nodetool compactionstats
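
As a rough sketch, the three steps above could look something like this. The host, keyspace and CF names are placeholders, the max threshold of 32 is an assumed value, and the setcompactionthreshold argument order is as I recall it for 0.7-era nodetool, so check the usage output of your own nodetool first. It only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
NODE_HOST="localhost"     # assumption: the node to operate on
KEYSPACE="MyKeyspace"     # hypothetical keyspace name
CFS="Users Events"        # hypothetical column family names

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

compaction_prep() {
    for cf in $CFS; do
        # Min threshold 2 so compaction can start with only two SSTables;
        # the max of 32 is an assumed value, keep whatever you use now.
        run nodetool -h "$NODE_HOST" setcompactionthreshold "$KEYSPACE" "$cf" 2 32
    done
    # Flushing writes new SSTables, which gives minor compaction a chance to trigger.
    run nodetool -h "$NODE_HOST" flush "$KEYSPACE"
    # Re-run this until no compactions are pending.
    run nodetool -h "$NODE_HOST" compactionstats
}

compaction_prep
```

Remember to put the thresholds back to their previous values once the maintenance is done.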

Then do a repair, one CF at a time, starting with the smallest CF. Monitor disk space
and 
nodetool compactionstats 
then
nodetool netstats
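
A minimal sketch of that loop, assuming you have already worked out the size order by hand. The host, keyspace and CF names are placeholders, and it only prints the commands unless DRY_RUN=0 is set. The checks after each CF act as a gate before moving on to the next one; to watch a repair while it is in flight, run the same three checks from a second terminal.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
NODE_HOST="localhost"                # assumption: the node to repair
KEYSPACE="MyKeyspace"                # hypothetical keyspace name
CFS_SMALLEST_FIRST="Users Events"    # hypothetical CF names, ordered by on-disk size
DATA_DIR="/var/lib/cassandra/data"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

repair_each_cf() {
    for cf in $CFS_SMALLEST_FIRST; do
        # Repair a single column family.
        run nodetool -h "$NODE_HOST" repair "$KEYSPACE" "$cf"
        # Check free disk space before starting the next CF.
        run df -h "$DATA_DIR"
        # Compactions triggered by the repair.
        run nodetool -h "$NODE_HOST" compactionstats
        # Streaming activity between nodes.
        run nodetool -h "$NODE_HOST" netstats
    done
}

repair_each_cf
```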


If you have the network space I would just move the files and then put them back...

* drain
* copy the /var/lib/cassandra/data and saved_caches dirs
* copy the yaml 
* blast away
* put things back in  place
* start up and run repair
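
Something along these lines, as a sketch rather than a tested procedure. The backup location and the yaml path are assumptions, so adjust them for your install. Stopping and starting Cassandra is left as comments because it depends on how the node is run. It only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
NODE_HOST="localhost"                      # assumption: the node under maintenance
CASS_DIR="/var/lib/cassandra"              # holds data/ and saved_caches/
YAML_PATH="/etc/cassandra/cassandra.yaml"  # assumption: config location varies by install
BACKUP_DIR="/mnt/backup/cassandra"         # hypothetical network-mounted backup location

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

backup_node() {
    # Flush everything to disk and stop accepting writes.
    run nodetool -h "$NODE_HOST" drain
    # Stop the Cassandra process here before copying.
    run mkdir -p "$BACKUP_DIR"
    run cp -a "$CASS_DIR/data" "$BACKUP_DIR/"
    run cp -a "$CASS_DIR/saved_caches" "$BACKUP_DIR/"
    run cp -a "$YAML_PATH" "$BACKUP_DIR/"
}

restore_node() {
    # Run after the partition has been rebuilt and Cassandra reinstalled.
    run cp -a "$BACKUP_DIR/data" "$CASS_DIR/"
    run cp -a "$BACKUP_DIR/saved_caches" "$CASS_DIR/"
    run cp -a "$BACKUP_DIR/cassandra.yaml" "$YAML_PATH"
    # Start the Cassandra process here, wait for it to come up, then repair.
    run nodetool -h "$NODE_HOST" repair
}

case "${1:-}" in
    backup)  backup_node ;;
    restore) restore_node ;;
    *)       echo "usage: $0 backup|restore" ;;
esac
```

Run it with backup before blasting away the partition and with restore afterwards.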

I know you have RF 3 and 3 nodes. I'm being cautious. If you don't have space the current approach
is fine. 

You may want to disable Hinted Handoff while you are doing this as you are going to run repair
anyway when the node comes back. 

Cheers

  
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 19/08/2011, at 11:57 AM, Anand Somani wrote:

> Hi,
> 
> version - 0.7.4
> cluster size = 3
> RF = 3.
> data size on a node ~500G
> 
> I want to do some disk maintenance on a cassandra node, so the process that I came up with is
> drain this node
> back up the system data space
> rebuild the disk partition
> copy data from another node
> copy data from the backed up system data
> restart node
> run nodetool repair
> Is this process sane? Never done a read repair on this cluster before, is that a problem? Should I run it per CF? Would it help if I did this before bringing the node down?
> 
> Any pointers, things to worry about.
> 
> Thanks
> Anand

