cassandra-user mailing list archives

From Romain Hardouin <>
Subject Re: How to do cassandra routine maintenance
Date Fri, 08 Sep 2017 20:53:28 GMT
You should read about repair maintenance. Consider installing and running C* Reaper to do so.

STCS doesn't work well with TTL. I saw you have done some tuning; it's hard to say whether it's OK without knowing the workload. LCS is better for TTL (but requires fast disks), and if you're working with time series, consider TWCS.

If the CPUs are not overloaded, you can also consider Snappy compression (by the way, check the compression ratio). Again, depending on your data model and your queries, chunk_length_in_kb might be increased to get more effective compression (generally speaking, we tend to lower it to improve read performance).
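
For a rough sense of scale, the numbers quoted in the original message below can be converted into days and row counts. This is only a back-of-the-envelope Python sketch: the values are copied from the quoted table settings and insert rate, while the function name repair_interval_ok and the candidate repair intervals are made up for illustration. It assumes every row uses the table's default TTL, and it relies on the usual guidance that a full repair should complete on every node within gc_grace_seconds.

```python
SECONDS_PER_DAY = 86_400

# Values copied from the quoted table definition and message.
default_time_to_live = 8_640_000   # seconds
gc_grace_seconds = 432_000         # seconds
records_per_day = (200_000_000, 300_000_000)  # low and high estimate

ttl_days = default_time_to_live / SECONDS_PER_DAY
gc_grace_days = gc_grace_seconds / SECONDS_PER_DAY

# Assumption: every row uses the default TTL, so once the first rows start
# expiring, roughly ttl_days worth of inserts are live at any time.
live_records = tuple(rate * ttl_days for rate in records_per_day)


def repair_interval_ok(interval_days: float, grace_days: float) -> bool:
    """Hypothetical helper: True if repairs run more often than gc_grace."""
    return interval_days < grace_days


print(f"TTL: {ttl_days:.0f} days, gc_grace: {gc_grace_days:.0f} days")
print(f"Live rows at steady state: {live_records[0]:.1e} to {live_records[1]:.1e}")
for interval in (4, 7):  # illustrative repair schedules, in days
    print(f"Repair every {interval} days OK? {repair_interval_ok(interval, gc_grace_days)}")
```

With these settings a weekly repair would be too infrequent, because gc_grace_seconds is only 5 days. Either the repair cycle has to fit inside that window or gc_grace_seconds has to be raised.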
    On Saturday, 2 September 2017 at 04:17:22 UTC+2, qf zhou <> wrote:
I am using a cluster with 3 Cassandra nodes; the cluster version is 3.0.9. Each day about
200~300 million records are inserted into the cluster.
As time goes by, the data occupies more and more disk space. Currently, the
data distribution on each node is as follows:

UN  2.5 TiB    256          66.3%            c5271e74-19a1-4cee-98d7-dc169cf87e95 
UN  1.73 TiB  256          67.0%            c623bbc0-9839-4d2d-8ff3-db7115719d59 
UN  1.86 TiB  256          66.7%            c555e44c-9590-4f45-aea4-f5eca68180b2 

There is only one datacenter.  

The compaction strategy is here:
    compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'max_threshold': '32', 'min_threshold': '12', 'tombstone_threshold': '0.1', 'unchecked_tombstone_compaction':
    AND compression = {'chunk_length_in_kb': '64', 'class': ''}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 8640000
    AND gc_grace_seconds = 432000

I really want to know how to do Cassandra routine maintenance.

I found that the data seems to grow faster and the disk is under heavy load. Sometimes I see
data inconsistency: the same query returns two different results.

So what should I do to keep the cluster healthy, and how should I maintain it?

I would appreciate any help very much! Thanks a lot!
