cassandra-user mailing list archives

From Alexandru Sicoe <>
Subject emptying my cluster
Date Mon, 02 Jan 2012 22:00:17 GMT
Hi everyone and Happy New Year!

I need advice on organizing the flow of data out of my 3-node Cassandra
0.8.6 cluster. I am configuring my keyspace to use the
NetworkTopologyStrategy. I have 2 data centers, each with a replication
factor of 1 (i.e. DC1:1; DC2:1). The configuration of the
PropertyFileSnitch is:



I assign tokens like this:
                        node1 = 0
                        node2 = 1
                        node3 = 85070591730234615865843651857942052864
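
For reference, these values follow the usual pattern of spacing the nodes
of each data center evenly around the ring and shifting the second data
center by 1 so that no two nodes share a token. A minimal sketch in Python,
assuming the RandomPartitioner (token range 0 to 2**127) and assuming that
node1 and node3 share one data center while node2 sits alone in the other;
the function name is only illustrative:

```python
# Assumption: RandomPartitioner, whose token space is 0 .. 2**127.
RING_SIZE = 2 ** 127


def dc_tokens(node_count, offset=0):
    """Evenly spaced tokens for one data center, shifted by `offset`."""
    return [(i * RING_SIZE // node_count + offset) % RING_SIZE
            for i in range(node_count)]


# Assumption: node1 and node3 are in one data center, node2 in the other.
node1, node3 = dc_tokens(2)
(node2,) = dc_tokens(1, offset=1)

print("node1 =", node1)
print("node2 =", node2)
print("node3 =", node3)
```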

My write consistency level is ANY.

My data sources only insert data into node1 and node3. Essentially, a
replica of every input value ends up on node2, so node2 holds a copy of all
the data written to the cluster. When node2 starts getting full, I want a
script that takes it offline and runs a sequence of operations (compacting,
snapshotting, exporting and truncating the CFs) in order to back up the
data to a remote location and free up space on the node for more data. When
node2 comes back online, it will receive hints from the other 2 nodes.
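
The "getting full" trigger for such a script could be as simple as a disk
usage check. A minimal sketch, assuming the data directory path and the 80%
threshold below are placeholders and not values from my setup; the function
names are only illustrative:

```python
import shutil


def fraction_used(path):
    """Return the used fraction (0.0 to 1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def node_is_getting_full(path, threshold=0.8):
    """True once the filesystem holding `path` is at or above `threshold`."""
    return fraction_used(path) >= threshold


if __name__ == "__main__":
    # Hypothetical location; point this at the node's actual data directory.
    data_dir = "."
    if node_is_getting_full(data_dir):
        print("time to take the node offline and ship its data out")
    else:
        print("still room: %.0f%% used" % (100 * fraction_used(data_dir)))
```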

This is how I plan to ship data out of my cluster without any downtime or
major performance penalty. The problem comes when I also want to truncate
the CFs on node1 and node3 to free up space there. I don't know whether I
can do this without downtime or a serious performance penalty. Is anyone
using truncate to clear the data out of CFs? How efficient is it?

Any observations or suggestions are much appreciated!

