cassandra-user mailing list archives

From Dwight Smith
Subject Impact of running major compaction with Size Tiered Compaction - version 1.1.11
Date Fri, 10 Jan 2014 23:21:49 GMT

We have a 6 node cluster in two DCs, Cassandra version 1.1.11, RF=3 in each DC.

The DataStax Documentation says the following:

Initiate a major compaction through nodetool compact.
A major compaction merges all SSTables into one. Though major compaction can free disk space
used by accumulated SSTables, during runtime it temporarily doubles disk space usage and is
I/O and CPU intensive. After running a major compaction, automatic minor compactions are no
longer triggered on a frequent basis. Consequently, you no longer have to manually run major
compactions on a routine basis. Expect read performance to improve immediately following a
major compaction, and then to continually degrade until you invoke the next major compaction.
For this reason, DataStax does not recommend major compaction.

A maintenance procedure runs periodically on each node in the cluster. It performs
repair -pr, flush, compact, and then cleanup.
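
For reference, the procedure above can be sketched roughly as follows. This is only an illustration, not the actual maintenance script: the host name "node1" and keyspace name "my_keyspace" are hypothetical placeholders, and the exact nodetool argument order may differ between versions. The include_major_compaction flag is likewise an assumption, added to show what the sequence would look like without the compact step.

```python
import subprocess


def maintenance_commands(host, keyspace, include_major_compaction=True):
    """Build the nodetool invocations for one node, in the order described above.

    host and keyspace are placeholders supplied by the caller; nothing here
    is taken from the real cluster configuration.
    """
    steps = [
        ["repair", "-pr", keyspace],  # repair only this node's primary range
        ["flush", keyspace],          # write memtables out to SSTables
    ]
    if include_major_compaction:
        steps.append(["compact", keyspace])  # major compaction
    steps.append(["cleanup", keyspace])      # drop data the node no longer owns
    return [["nodetool", "-h", host] + step for step in steps]


def run_maintenance(host, keyspace, include_major_compaction=True):
    """Run each step in order, stopping at the first failing command."""
    for command in maintenance_commands(host, keyspace, include_major_compaction):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them against a live node.
    for command in maintenance_commands("node1", "my_keyspace"):
        print(" ".join(command))
```

Calling maintenance_commands with include_major_compaction=False yields the same sequence minus the compact step, which is the variant the questions below are about.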

This runs fine for all CFs except one, which is very large and has large rows. All entries
in that CF have TTLs shorter than gc_grace.

The maintenance has just completed after running for 9+ hours. The SSTables for the xxxx CF
are currently as follows:

  19977911 Dec 27 06:38 xxxx-hf-57288-Data.db
      5817 Dec 27 06:52 xxxx-hf-57304-Data.db
2735747237 Dec 27 06:52 xxxx-hf-57291-Data.db
    718192 Dec 27 06:52 xxxx-hf-57305-Data.db
2581373226 Dec 29 16:48 xxxx-hf-57912-Data.db
 936062446 Jan  9 22:22 xxxx-hf-58875-Data.db
 235463043 Jan 10 05:23 xxxx-hf-58888-Data.db
  60851675 Jan 10 08:33 xxxx-hf-58893-Data.db
  60871570 Jan 10 11:44 xxxx-hf-58898-Data.db
  60537384 Jan 10 14:54 xxxx-hf-58903-Data.db

min_compaction_threshold is set to 4.
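
To make the role of the threshold concrete, here is a simplified sketch of how size-tiered compaction groups SSTables of similar size into buckets and only considers a bucket once it holds at least the minimum number of files. It is an approximation, not Cassandra's actual code: the 0.5x to 1.5x similarity bounds and the 50 MB min_sstable_size are assumed defaults, and the function name is made up for illustration. The sizes are the ones from the listing above.

```python
# SSTable sizes in bytes, copied from the listing for the xxxx CF.
SSTABLE_SIZES = [
    19977911, 5817, 2735747237, 718192, 2581373226,
    936062446, 235463043, 60851675, 60871570, 60537384,
]

MIN_COMPACTION_THRESHOLD = 4


def size_tiered_buckets(sizes, bucket_low=0.5, bucket_high=1.5,
                        min_sstable_size=50 * 1024 * 1024):
    """Group sizes into buckets of similar size (simplified model).

    bucket_low, bucket_high and min_sstable_size are assumed defaults,
    not values read from the cluster.
    """
    buckets = []  # each entry is [average_size, [member sizes]]
    for size in sorted(sizes):
        for bucket in buckets:
            avg = bucket[0]
            similar = avg * bucket_low < size < avg * bucket_high
            both_small = size < min_sstable_size and avg < min_sstable_size
            if similar or both_small:
                bucket[1].append(size)
                bucket[0] = sum(bucket[1]) / len(bucket[1])
                break
        else:
            # No existing bucket fits, so this SSTable starts a new one.
            buckets.append([float(size), [size]])
    return [members for _, members in buckets]


buckets = size_tiered_buckets(SSTABLE_SIZES)
eligible = [b for b in buckets if len(b) >= MIN_COMPACTION_THRESHOLD]

for members in buckets:
    print(len(members), members)
print("buckets eligible for minor compaction:", len(eligible))
```

Under these assumptions the ten files fall into five buckets holding 3, 3, 1, 1 and 2 SSTables, so no bucket currently reaches the threshold of 4.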

Now for the questions:

1) Given that the DataStax recommendation was not followed, will minor compactions still
be triggered if major compactions are no longer performed?

2) Would the remaining maintenance steps (repair -pr, flush, and cleanup) still be useful?

