cassandra-user mailing list archives

From Jeff Jirsa <jeff.ji...@crowdstrike.com>
Subject Re: Thousands of pending compactions using STCS
Date Fri, 11 Dec 2015 17:12:02 GMT
There were a few buggy versions in 2.1 (2.1.7 and 2.1.8, I believe) that showed this behavior.
The number of pending compactions was artificially high and not meaningful. As long as the
number of -Data.db sstables remains normal, compaction is keeping up and you're fine.
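Something like the following (a rough sketch only; it just walks the /var/lib/cassandra/data layout you describe) will count the -Data.db files per table directory, so you can watch whether that number stays flat:

    #!/usr/bin/env python
    # Rough sketch: count -Data.db sstables per table directory under the
    # data path mentioned in this thread. If these counts stay at normal
    # levels, compaction is keeping up regardless of what the PendingTasks
    # counter claims.
    import os
    from collections import Counter

    DATA_DIR = "/var/lib/cassandra/data"  # adjust if data_file_directories differs

    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(DATA_DIR):
        for name in filenames:
            if name.endswith("-Data.db"):
                # dirpath is <data_dir>/<keyspace>/<table> in 2.0/2.1 layouts
                counts[os.path.relpath(dirpath, DATA_DIR)] += 1

    for table_dir, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        print("%5d  %s" % (n, table_dir))
    print("total: %d" % sum(counts.values()))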

- Jeff

From:  Vasileios Vlachos
Reply-To:  "user@cassandra.apache.org"
Date:  Friday, December 11, 2015 at 8:28 AM
To:  "user@cassandra.apache.org"
Subject:  Thousands of pending compactions using STCS

Hello,

We use Nagios and MX4J for the majority of the monitoring we do for Cassandra (version: 2.0.16).
For compactions we hit the following URL:

http://${cassandra_host}:8081/mbean?objectname=org.apache.cassandra.db%3Atype%3DCompactionManager

and check the PendingTasks counter's value. 
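For context, the check is roughly the following (a simplified sketch of our Nagios plugin, not the exact script we run; the WARN/CRIT thresholds and the regex over the MX4J page are illustrative assumptions):

    #!/usr/bin/env python
    # Rough sketch of the Nagios-style check described above: fetch the MX4J
    # page for the CompactionManager MBean and pull out the PendingTasks value.
    # The parsing just greps the returned page for "PendingTasks"; the exact
    # MX4J markup may differ, so treat this as an illustration only.
    import re
    import sys
    import urllib2  # Python 2, matching the era of this thread

    CASSANDRA_HOST = sys.argv[1] if len(sys.argv) > 1 else "localhost"
    URL = ("http://%s:8081/mbean?objectname="
           "org.apache.cassandra.db%%3Atype%%3DCompactionManager" % CASSANDRA_HOST)

    WARN, CRIT = 50, 500  # hypothetical thresholds, tune for your cluster

    page = urllib2.urlopen(URL, timeout=10).read()
    match = re.search(r'PendingTasks\D*(\d+)', page)
    if not match:
        print("UNKNOWN - could not find PendingTasks in MX4J output")
        sys.exit(3)

    pending = int(match.group(1))
    if pending >= CRIT:
        print("CRITICAL - %d pending compactions" % pending)
        sys.exit(2)
    elif pending >= WARN:
        print("WARNING - %d pending compactions" % pending)
        sys.exit(1)
    print("OK - %d pending compactions" % pending)
    sys.exit(0)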

We have noticed that occasionally one or more nodes report back that they have thousands
of pending compactions. We have 11 KS in the cluster and a total of 109 *Data.db files under
/var/lib/cassandra/data, which gives approximately 10 SSTables per KS. Given the number of
SSTables we seem to have at any given time in each KS/CF directory, thousands of pending
compactions looks unrealistic to us. The logs show a lot of flush and compaction activity,
but we don't think that's unusual. Also, each CF is configured with
min_compaction_threshold = 2 and max_compaction_threshold = 32. The two screenshots below
show a cluster-wide view of pending compactions. Attached you can find the XML files which
contain the data from the MX4J console.



And this is the same graph, but restricted to the period after 14:00, in order to show what
the real compaction activity looks like when it is not skewed by the incredibly high number
of pending compactions shown above:


Has anyone else experienced something similar? Is there anything else we can do to work out
whether something is actually wrong with our cluster?
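One thing we could do ourselves is compare the MX4J figure with what nodetool reports (a rough sketch below; it assumes nodetool is on the PATH and that compactionstats prints a "pending tasks" line):

    #!/usr/bin/env python
    # Rough sketch of a cross-check: ask nodetool for its view of pending
    # compactions and compare it with the MX4J/JMX figure.
    import re
    import subprocess

    out = subprocess.check_output(["nodetool", "compactionstats"])
    match = re.search(r"pending tasks:\s*(\d+)", out)
    print("nodetool pending tasks: %s" % (match.group(1) if match else "not found"))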

Thanks in advance for any help!

Vasilis

