cassandra-commits mailing list archives

From "Paulo Ricardo Motta Gomes (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6563) TTL histogram compactions not triggered at high "Estimated droppable tombstones" rate
Date Fri, 16 May 2014 20:29:17 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14000258#comment-14000258 ]

Paulo Ricardo Motta Gomes commented on CASSANDRA-6563:
------------------------------------------------------

Below I present some analysis of a live cluster, about 10 days after deploying the original
patch that entirely removes the range-overlap check in worthDroppingTombstones().
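
For reference, a much simplified sketch of the decision the patch changes (the SSTable type, field names and threshold constant below are stand-ins for illustration, not the actual Cassandra code): the stock logic only nominates an SSTable for a single-SSTable tombstone compaction if its droppable tombstone ratio exceeds the threshold *and* an overlap-based check passes, while the v1 patch nominates it based on the ratio alone.

{code:java}
import java.util.List;

// Stand-in types for illustration only; not Cassandra's real classes.
class SSTable
{
    final String name;
    final double droppableTombstoneRatio; // estimated from the deletion-time histogram
    final long minToken, maxToken;        // token range covered by this SSTable

    SSTable(String name, double droppableTombstoneRatio, long minToken, long maxToken)
    {
        this.name = name;
        this.droppableTombstoneRatio = droppableTombstoneRatio;
        this.minToken = minToken;
        this.maxToken = maxToken;
    }

    boolean overlaps(SSTable other)
    {
        return minToken <= other.maxToken && other.minToken <= maxToken;
    }
}

class TombstoneCompactionCheck
{
    static final double TOMBSTONE_THRESHOLD = 0.2; // default droppable tombstone threshold

    // Roughly the stock behaviour: ratio check plus an overlap-based check that bails
    // out when other SSTables cover the candidate's token range.
    static boolean worthDroppingTombstonesStock(SSTable candidate, List<SSTable> others)
    {
        if (candidate.droppableTombstoneRatio <= TOMBSTONE_THRESHOLD)
            return false;
        return others.stream().noneMatch(candidate::overlaps);
    }

    // The v1 patch analysed here: drop the overlap check entirely.
    static boolean worthDroppingTombstonesV1(SSTable candidate)
    {
        return candidate.droppableTombstoneRatio > TOMBSTONE_THRESHOLD;
    }
}
{code}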

*Analysis Description*

In our dataset we use both LCS and STCS, but most of the CFs use STCS. A significant portion
of our dataset consists of append-only TTL'd data, which makes it a good match for tombstone
compaction. Most of our large CFs with a high droppable tombstone ratio use STCS, but a few
LCS CFs also benefited from the patch.

I deployed the patch on nodes in 2 different ranges, with similar results. The metrics were
collected between May 1st and May 16th; the nodes were patched on May 7th. The Cassandra
version used was 1.2.16.
  
In the analysis I compare total space used (Cassandra load), droppable tombstone ratio, disk
utilization (system disk xvbd util), total bytes compacted, and system load (Linux CPU). For
the last three metrics I also compute the integral of the metric, to make it easier to compare
the totals over the period.
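
The integral here is just the area under the sampled time series (e.g. total bytes compacted over the period); a minimal sketch of the calculation, with made-up sample times and values:

{code:java}
// Trapezoidal integral of a sampled metric: approximates the total amount
// accumulated over the collection period.
public class MetricIntegral
{
    static double integrate(double[] timesSeconds, double[] values)
    {
        double total = 0.0;
        for (int i = 1; i < timesSeconds.length; i++)
        {
            double dt = timesSeconds[i] - timesSeconds[i - 1];
            total += dt * (values[i] + values[i - 1]) / 2.0; // trapezoid area
        }
        return total;
    }

    public static void main(String[] args)
    {
        // Hypothetical samples: disk utilization (%) taken every 60 seconds.
        double[] t = { 0, 60, 120, 180 };
        double[] util = { 40.0, 55.0, 50.0, 45.0 };
        System.out.println("integral = " + integrate(t, util) + " percent-seconds");
    }
}
{code}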

*Analysis*

Graphs: https://issues.apache.org/jira/secure/attachment/12645241/patch-v1-range1.png

Each graph compares the metrics of the patched node with its previous and next neighbors;
vnodes are not used. The first row in the figure is node N-1, the second row is node N (the
patched node, marked with an asterisk), and the third row is node N+1.

* *Cassandra load*: On the patched node there is a sudden 7% drop in disk space when the patch
is applied, due to the execution of single-SSTable tombstone compactions. The growth rate of
disk usage is also lower after the patch, since tombstones are cleared more often. Over the
whole period, disk space grew 1.2% on the patched node, against about 10% on the unpatched
nodes.

* *Tombstone ratio*: After the patch is applied there is a clear drop in the droppable tombstone
ratio, which then hovers around the default 20% threshold (see the sketch after this list).
The droppable tombstone ratio of the unpatched nodes remains high for most CFs, which indicates
that tombstone compactions are not being triggered at all.

* *Disk utilization*: No change in the disk utilization pattern is detectable after the patch
is applied, which suggests that I/O is not affected by the patch, at least for our mixed dataset.
I double-checked the IOPS graph for the period and there was not even a slight change in the
I/O pattern after the patch was applied (https://issues.apache.org/jira/secure/attachment/12645312/patch-v1-iostat.png).

* *Total bytes compacted*: The number of bytes compacted on the patched node was about 17%
higher over the period: about 7% from the initial backlog of tombstones that was cleared, and
another 7% from tombstones cleared after the patch was applied (the difference between the
two nodes' sizes). The remaining 3% can be attributed to unnecessary compactions plus normal
variation between node ranges.

* *System CPU load*: Not affected by the patch.
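
To make the 20% figure above concrete, here is a simplified sketch of how a droppable tombstone ratio can be estimated from a histogram of column deletion times and compared against the threshold (the histogram contents and gcBefore value are made up; this is not the actual SSTableReader code):

{code:java}
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative only: estimate the fraction of columns whose tombstones/TTLs
// have already expired ("droppable") at a given point in time.
public class DroppableRatioSketch
{
    static double estimate(NavigableMap<Integer, Long> deletionTimeHistogram, int gcBefore)
    {
        long droppable = 0, total = 0;
        for (Map.Entry<Integer, Long> e : deletionTimeHistogram.entrySet())
        {
            total += e.getValue();
            if (e.getKey() < gcBefore)  // deletion time already past the gc grace cutoff
                droppable += e.getValue();
        }
        return total == 0 ? 0.0 : (double) droppable / total;
    }

    public static void main(String[] args)
    {
        // Hypothetical histogram: local deletion time (epoch seconds) -> column count.
        NavigableMap<Integer, Long> hist = new TreeMap<>();
        hist.put(1_398_000_000, 700L); // already expired by gcBefore
        hist.put(1_400_500_000, 300L); // not yet expired
        int gcBefore = 1_400_000_000;
        double ratio = estimate(hist, gcBefore);
        System.out.printf("droppable ratio = %.2f, above 0.20 threshold: %b%n",
                          ratio, ratio > 0.20);
    }
}
{code}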

*Alternative Patch*

I implemented another version of the patch (v2), as suggested by [~krummas], which instead of
dropping the overlap check entirely only performs the check against SSTables containing rows
with a smaller timestamp than the candidate SSTable (https://issues.apache.org/jira/secure/attachment/12645316/1.2.16-CASSANDRA-6563-v2.txt).
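
Roughly, the idea is the following (simplified stand-in types, not the actual patch code): the overlap check is restricted to SSTables holding rows older than the candidate's newest data, since only those rows could still be shadowed by the candidate's tombstones.

{code:java}
import java.util.List;
import java.util.stream.Collectors;

// Stand-in type for illustration: each SSTable tracks its token range and the
// minimum/maximum row timestamps it contains.
record SSTableInfo(String name, long minToken, long maxToken, long minTimestamp, long maxTimestamp)
{
    boolean overlaps(SSTableInfo other)
    {
        return minToken <= other.maxToken() && other.minToken() <= maxToken;
    }
}

class OverlapCheckV2
{
    // v2: only SSTables containing rows older than the candidate's newest data are
    // relevant, because only those rows could still be shadowed by its tombstones.
    static List<SSTableInfo> relevantOverlaps(SSTableInfo candidate, List<SSTableInfo> others)
    {
        return others.stream()
                     .filter(s -> s.minTimestamp() < candidate.maxTimestamp())
                     .filter(candidate::overlaps)
                     .collect(Collectors.toList());
    }
}
{code}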


One week ago I deployed this alternative patch on 2 of our production nodes, and unfortunately
loosening the check did not achieve significant results. I added some debug logging to the
code, and what I verified is that even though the number of SSTables to compare against is
reduced, whenever at least one SSTable has a column with an equal or lower timestamp than the
candidate SSTable, the token ranges of these SSTables always overlap because of the
RandomPartitioner. This supports the claim that even with the loosened check, single-SSTable
tombstone compactions are almost never triggered, at least for the use cases that could benefit
from them.
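
This is easy to see in isolation: with RandomPartitioner, row keys are hashed uniformly over the token space, so any SSTable with more than a handful of rows covers nearly the whole ring and its [min, max] token interval intersects that of every other non-trivial SSTable. A small self-contained simulation with made-up row counts:

{code:java}
import java.util.Random;

// Illustration: with tokens drawn uniformly at random (as a hash-based partitioner
// effectively does), the [min, max] token interval of any sizeable SSTable spans
// nearly the whole ring, so any two such intervals overlap.
public class TokenOverlapDemo
{
    static long[] minMaxTokens(Random rnd, int rowCount)
    {
        long min = Long.MAX_VALUE, max = Long.MIN_VALUE;
        for (int i = 0; i < rowCount; i++)
        {
            long token = rnd.nextLong(); // stand-in for a uniform hash of the row key
            min = Math.min(min, token);
            max = Math.max(max, token);
        }
        return new long[] { min, max };
    }

    public static void main(String[] args)
    {
        Random rnd = new Random(42);
        long[] a = minMaxTokens(rnd, 10_000); // hypothetical SSTable A
        long[] b = minMaxTokens(rnd, 10_000); // hypothetical SSTable B
        boolean overlap = a[0] <= b[1] && b[0] <= a[1];
        System.out.println("A covers " + a[0] + " .. " + a[1]);
        System.out.println("B covers " + b[0] + " .. " + b[1]);
        System.out.println("token ranges overlap: " + overlap); // virtually always true
    }
}
{code}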

The graphs for the alternative patch analysis can be found here: https://issues.apache.org/jira/secure/attachment/12645240/patch-v2-range3.png

> TTL histogram compactions not triggered at high "Estimated droppable tombstones" rate
> -------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6563
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6563
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: 1.2.12ish
>            Reporter: Chris Burroughs
>            Assignee: Paulo Ricardo Motta Gomes
>             Fix For: 1.2.17, 2.0.8
>
>         Attachments: 1.2.16-CASSANDRA-6563-v2.txt, 1.2.16-CASSANDRA-6563.txt, 2.0.7-CASSANDRA-6563.txt,
> patch-v1-iostat.png, patch-v1-range1.png, patch-v2-range3.png, patched-droppadble-ratio.png,
> patched-storage-load.png, patched1-compacted-bytes.png, patched2-compacted-bytes.png, unpatched-droppable-ratio.png,
> unpatched-storage-load.png, unpatched1-compacted-bytes.png, unpatched2-compacted-bytes.png
>
>
> I have several column families in a largish cluster where virtually all columns are written
> with a (usually the same) TTL. My understanding of CASSANDRA-3442 is that sstables that have
> a high (> 20%) estimated percentage of droppable tombstones should be individually compacted.
> This does not appear to be occurring with size tiered compaction.
> Example from one node:
> {noformat}
> $ ll /data/sstables/data/ks/Cf/*Data.db
> -rw-rw-r-- 31 cassandra cassandra 26651211757 Nov 26 22:59 /data/sstables/data/ks/Cf/ks-Cf-ic-295562-Data.db
> -rw-rw-r-- 31 cassandra cassandra  6272641818 Nov 27 02:51 /data/sstables/data/ks/Cf/ks-Cf-ic-296121-Data.db
> -rw-rw-r-- 31 cassandra cassandra  1814691996 Dec  4 21:50 /data/sstables/data/ks/Cf/ks-Cf-ic-320449-Data.db
> -rw-rw-r-- 30 cassandra cassandra 10909061157 Dec 11 17:31 /data/sstables/data/ks/Cf/ks-Cf-ic-340318-Data.db
> -rw-rw-r-- 29 cassandra cassandra   459508942 Dec 12 10:37 /data/sstables/data/ks/Cf/ks-Cf-ic-342259-Data.db
> -rw-rw-r--  1 cassandra cassandra      336908 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342307-Data.db
> -rw-rw-r--  1 cassandra cassandra     2063935 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342309-Data.db
> -rw-rw-r--  1 cassandra cassandra         409 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342314-Data.db
> -rw-rw-r--  1 cassandra cassandra    31180007 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342319-Data.db
> -rw-rw-r--  1 cassandra cassandra     2398345 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342322-Data.db
> -rw-rw-r--  1 cassandra cassandra       21095 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342331-Data.db
> -rw-rw-r--  1 cassandra cassandra       81454 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342335-Data.db
> -rw-rw-r--  1 cassandra cassandra     1063718 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342339-Data.db
> -rw-rw-r--  1 cassandra cassandra      127004 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342344-Data.db
> -rw-rw-r--  1 cassandra cassandra      146785 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342346-Data.db
> -rw-rw-r--  1 cassandra cassandra      697338 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342351-Data.db
> -rw-rw-r--  1 cassandra cassandra     3921428 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342367-Data.db
> -rw-rw-r--  1 cassandra cassandra      240332 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342370-Data.db
> -rw-rw-r--  1 cassandra cassandra       45669 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342374-Data.db
> -rw-rw-r--  1 cassandra cassandra    53127549 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342375-Data.db
> -rw-rw-r-- 16 cassandra cassandra 12466853166 Dec 25 22:40 /data/sstables/data/ks/Cf/ks-Cf-ic-396473-Data.db
> -rw-rw-r-- 12 cassandra cassandra  3903237198 Dec 29 19:42 /data/sstables/data/ks/Cf/ks-Cf-ic-408926-Data.db
> -rw-rw-r--  7 cassandra cassandra  3692260987 Jan  3 08:25 /data/sstables/data/ks/Cf/ks-Cf-ic-427733-Data.db
> -rw-rw-r--  4 cassandra cassandra  3971403602 Jan  6 20:50 /data/sstables/data/ks/Cf/ks-Cf-ic-437537-Data.db
> -rw-rw-r--  3 cassandra cassandra  1007832224 Jan  7 15:19 /data/sstables/data/ks/Cf/ks-Cf-ic-440331-Data.db
> -rw-rw-r--  2 cassandra cassandra   896132537 Jan  8 11:05 /data/sstables/data/ks/Cf/ks-Cf-ic-447740-Data.db
> -rw-rw-r--  1 cassandra cassandra   963039096 Jan  9 04:59 /data/sstables/data/ks/Cf/ks-Cf-ic-449425-Data.db
> -rw-rw-r--  1 cassandra cassandra   232168351 Jan  9 10:14 /data/sstables/data/ks/Cf/ks-Cf-ic-450287-Data.db
> -rw-rw-r--  1 cassandra cassandra    73126319 Jan  9 11:28 /data/sstables/data/ks/Cf/ks-Cf-ic-450307-Data.db
> -rw-rw-r--  1 cassandra cassandra    40921916 Jan  9 12:08 /data/sstables/data/ks/Cf/ks-Cf-ic-450336-Data.db
> -rw-rw-r--  1 cassandra cassandra    60881193 Jan  9 12:23 /data/sstables/data/ks/Cf/ks-Cf-ic-450341-Data.db
> -rw-rw-r--  1 cassandra cassandra        4746 Jan  9 12:23 /data/sstables/data/ks/Cf/ks-Cf-ic-450350-Data.db
> -rw-rw-r--  1 cassandra cassandra        5769 Jan  9 12:23 /data/sstables/data/ks/Cf/ks-Cf-ic-450352-Data.db
> {noformat}
> {noformat}
> 295562: Estimated droppable tombstones: 0.899035828535183
> 296121: Estimated droppable tombstones: 0.9135080937806197
> 320449: Estimated droppable tombstones: 0.8916766879896414
> {noformat}
> I've checked in on this example node several times and compactionstats has not shown
> any other activity that would be blocking the tombstone-based compaction. The TTL is in the
> 15-20 day range, so an sstable from November should have had ample opportunities by January.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
