cassandra-user mailing list archives

From Lucas Benevides <lu...@maurobenevides.com.br>
Subject Re: Cassandra Compaction Metrics - CompletedTasks vs TotalCompactionCompleted
Date Wed, 01 Nov 2017 13:07:10 GMT
Thanks a lot Chris,

I had noticed that even the counter in TotalCompactionsCompleted is
higher than the number of SSTable compactions, which is what interests me
most. I measured the number of compactions by turning on log_all in the
tables' compaction settings and reading the compaction.log data (in
JSON format).
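
A minimal sketch of that counting step, for anyone who wants to repeat the measurement. It assumes each line of compaction.log is one JSON object with a "type" field equal to "compaction" for a finished compaction, plus "keyspace" and "table" fields. Those field names, and the helper name count_compactions, are assumptions for illustration; check them against your own log before relying on the numbers.

```python
import json
from collections import Counter


def count_compactions(lines):
    """Count log events of type "compaction" per (keyspace, table).

    Assumes one JSON object per line; the field names used here
    ("type", "keyspace", "table") are assumptions about the log format.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(event, dict):
            continue
        if event.get("type") == "compaction":
            counts[(event.get("keyspace"), event.get("table"))] += 1
    return counts


# Made-up sample lines, only to show the shape of the input.
sample = [
    '{"type": "enable", "keyspace": "ks", "table": "t", "time": 1}',
    '{"type": "compaction", "keyspace": "ks", "table": "t", "time": 2}',
    '{"type": "flush", "keyspace": "ks", "table": "t", "time": 3}',
    '{"type": "compaction", "keyspace": "ks", "table": "t", "time": 4}',
]
print(count_compactions(sample))
```

To run it on a real file, pass the open file object instead of the sample list, for example count_compactions(open(path)) with path pointing at your compaction.log.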

The information you gave will be very useful to me. I hope it makes it into
the documentation.

Lucas Benevides

2017-10-31 14:56 GMT-02:00 Chris Lohfink <clohfink85@gmail.com>:

> CompactionMetrics is a combination of the compaction executor (sstable
> compactions, secondary index build, view building, relocate,
> garbagecollect, cleanup, scrub etc) and validation executor (repairs). Keep
> in mind that not all jobs execute 1 task per operation; things that use
> parallelAllSSTableOperation, like cleanup, will create 1 task per sstable.
>
> The "CompletedTasks" metric is a measure of how many tasks ran on these
> two executors combined.
> The "TotalCompactionsCompleted" metric is a measure of how many
> compactions issued from the compaction manager ran (normal compactions,
> cache writes, scrub, 2i and MVs). So while they may be close, depending on
> what's happening on the system, there's no assurance that they will be within
> any bounds of each other.
>
> So I would suspect validation compactions from repairs would be one major
> difference. If you run other operational tasks there will likely be more.
>
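
A toy model of the "1 task per sstable" point above, written in Python purely for illustration. It is not Cassandra code: CountingExecutor, run_operation and the sstable names are hypothetical, and the "operations" counter is only a stand-in for a per-job count, not an exact model of either metric. It only shows why a counter kept at the executor level can run well ahead of the number of logical jobs.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CountingExecutor:
    """Thread pool wrapper that counts every task it finishes (hypothetical)."""

    def __init__(self, workers=2):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._lock = threading.Lock()
        self.completed_tasks = 0

    def submit(self, fn, *args):
        def wrapped():
            try:
                return fn(*args)
            finally:
                # Incremented before the future resolves.
                with self._lock:
                    self.completed_tasks += 1
        return self._pool.submit(wrapped)

    def shutdown(self):
        self._pool.shutdown(wait=True)


def run_operation(executor, sstables, per_sstable):
    """Run one logical job and return how many tasks it submitted."""
    if per_sstable:
        # Cleanup-like job: one task for each sstable.
        futures = [executor.submit(len, name) for name in sstables]
    else:
        # Ordinary job: a single task covering all sstables.
        futures = [executor.submit(len, sstables)]
    for future in futures:
        future.result()
    return len(futures)


executor = CountingExecutor()
sstables = ["a-Data.db", "b-Data.db", "c-Data.db", "d-Data.db"]
operations = 0

run_operation(executor, sstables, per_sstable=False)
operations += 1
run_operation(executor, sstables, per_sstable=True)
operations += 1
executor.shutdown()

print("operations:", operations, "completed tasks:", executor.completed_tasks)
```

With four sstables, the two jobs finish five tasks in total, so the gap between the two counts grows with the number of sstables touched by per-sstable jobs.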
>
> On Mon, Oct 30, 2017 at 12:22 PM, Lucas Benevides <
> lucas@maurobenevides.com.br> wrote:
>
>> Kurt,
>>
>> I appreciate your answer, but I don't believe CompletedTasks counts the
>> "validation compactions". These are compactions that occur during repair
>> operations. I am running tests on 10 cluster nodes in the same physical
>> rack with the Cassandra Stress Tool, and I did not run any repair commands.
>> The tables only last for seven hours, so it is not reasonable that tens of
>> thousands of these validation compactions occur per node.
>>
>> I looked at the code, and the CompletedTasks counter seems to be
>> populated by a method from the class
>> java.util.concurrent.ThreadPoolExecutor.
>> So I really don't know what it is, but it is surely not the number of
>> completed compaction tasks.
>>
>> Thank you
>> Lucas Benevides
>>
>>
>>
>> 2017-10-30 8:05 GMT-02:00 kurt greaves <kurt@instaclustr.com>:
>>
>>> I believe (may be wrong) that CompletedTasks counts Validation
>>> compactions while TotalCompactionsCompleted does not. Considering that a
>>> lot of validation compactions can be created by every repair, it might
>>> explain the difference. I'm not sure why they are named that way or work
>>> the way they
>>> do. There appears to be no documentation around this in the code (what a
>>> surprise) and looks like it was last touched in CASSANDRA-4009
>>> <https://issues.apache.org/jira/browse/CASSANDRA-4009>, which also has
>>> no useful info.
>>>
>>> On 27 October 2017 at 13:48, Lucas Benevides <
>>> lucas@maurobenevides.com.br> wrote:
>>>
>>>> Dear community,
>>>>
>>>> I am studying the behaviour of the Cassandra
>>>> TimeWindowCompactionStrategy. To do so I am watching some metrics. Two of
>>>> these metrics are important: Compaction.CompletedTasks, a gauge, and
>>>> TotalCompactionsCompleted, a Meter.
>>>>
>>>> According to the documentation
>>>> (http://cassandra.apache.org/doc/latest/operating/metrics.html#table-metrics):
>>>> CompletedTasks = Number of completed compactions since server [re]start.
>>>> TotalCompactionsCompleted = Throughput of completed compactions since
>>>> server [re]start.
>>>>
>>>> As I understand it, the Meter object behind TotalCompactionsCompleted
>>>> has a counter, which I assumed would be numerically close to the
>>>> CompletedTasks gauge. But they are very different, with CompletedTasks
>>>> being much higher than TotalCompactionsCompleted.
>>>>
>>>> According to the code on GitHub (class metrics.CompactionMetrics.java):
>>>> CompletedTasks - Number of completed compactions since server [re]start
>>>> TotalCompactionsCompleted - Total number of compactions since server
>>>> [re]start
>>>>
>>>> Can you help me by explaining the difference between these two metrics?
>>>> They seem to have very distinct values, with CompletedTasks being
>>>> around 1000 times the value of the counter in
>>>> TotalCompactionsCompleted.
>>>>
>>>> Thanks in advance,
>>>> Lucas Benevides
>>>>
>>>>
>>>
>>
>
