cassandra-user mailing list archives

From Léo FERLIN SUTTON <lfer...@mailjet.com.INVALID>
Subject Re: Tombstones not getting purged
Date Thu, 20 Jun 2019 15:33:32 GMT
Thank you for the information !

On Thu, Jun 20, 2019 at 9:50 AM Alexander Dejanovski <alex@thelastpickle.com>
wrote:

> Léo,
>
> if a major compaction isn't a viable option, you can give the Instaclustr
> SSTable tools a go to target the partitions with the most tombstones:
> https://github.com/instaclustr/cassandra-sstable-tools/tree/cassandra-2.2#ic-purge
>
> It generates a report like this:
>
> Summary:
>
> +---------+---------+
> |         | Size    |
> +---------+---------+
> | Disk    |  1.9 GB |
> | Reclaim | 11.7 MB |
> +---------+---------+
>
> Largest reclaimable partitions:
>
> +--------------+--------+---------+-----------------+
> | Key          | Size   | Reclaim | Generations     |
> +--------------+--------+---------+-----------------+
> | 001.2.340862 | 3.2 kB |  3.2 kB | [534, 438, 498] |
> | 001.2.946243 | 2.9 kB |  2.8 kB | [534, 434, 384] |
> | 001.1.527557 | 2.8 kB |  2.7 kB | [534, 519, 394] |
> | 001.2.181797 | 2.6 kB |  2.6 kB | [534, 424, 343] |
> | 001.3.475853 | 2.7 kB |    28 B |      [524, 462] |
> | 001.0.159704 | 2.7 kB |    28 B |      [440, 247] |
> | 001.1.311372 | 2.6 kB |    28 B |      [424, 458] |
> | 001.0.756293 | 2.6 kB |    28 B |      [428, 358] |
> | 001.2.681009 | 2.5 kB |    28 B |      [440, 241] |
> | 001.2.474773 | 2.5 kB |    28 B |      [524, 484] |
> | 001.2.974571 | 2.5 kB |    28 B |      [386, 517] |
> | 001.0.143176 | 2.5 kB |    28 B |      [518, 368] |
> | 001.1.185198 | 2.5 kB |    28 B |      [517, 386] |
> | 001.3.503517 | 2.5 kB |    28 B |      [426, 346] |
> | 001.1.847384 | 2.5 kB |    28 B |      [436, 396] |
> | 001.0.949269 | 2.5 kB |    28 B |      [516, 356] |
> | 001.0.756763 | 2.5 kB |    28 B |      [440, 249] |
> | 001.3.973808 | 2.5 kB |    28 B |      [517, 386] |
> | 001.0.312718 | 2.4 kB |    28 B |      [524, 467] |
> | 001.3.632066 | 2.4 kB |    28 B |      [432, 377] |
> | 001.1.946590 | 2.4 kB |    28 B |      [519, 389] |
> | 001.1.798591 | 2.4 kB |    28 B |      [434, 388] |
> | 001.3.953922 | 2.4 kB |    28 B |      [432, 375] |
> | 001.2.585518 | 2.4 kB |    28 B |      [432, 375] |
> | 001.3.284942 | 2.4 kB |    28 B |      [376, 432] |
> +--------------+--------+---------+-----------------+
>
> Once you've identified these partitions you can run a compaction on the
> SSTables that contain them (identified using "nodetool getsstables").
> Note that user defined compactions are only available for STCS.
> Also ic-purge will perform a compaction but without writing to disk
> (should look like a validation compaction), so it is rightfully reported by
> the docs as an "intensive process" (not more than a repair though).
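
A rough sketch of that last step, not something taken from the thread: `nodetool getsstables` prints one SSTable path per line, while a user defined compaction (the `forceUserDefinedCompaction` operation on the CompactionManager MBean) takes a single comma-separated list. The helper name `join_sstables`, the example key, and the commented jmxterm call are illustrative assumptions; the keyspace and table names are the ones from the path quoted further down.

```shell
# Join SSTable paths (one per line on stdin) into the comma-separated
# form that a user defined compaction expects.
join_sstables() {
  paste -s -d, -
}

# Hypothetical usage on a node (untested assumption; "stats" and "tablename"
# come from the path in the thread, the key is only an example, and the
# jmxterm jar name is a placeholder):
#   files=$(nodetool getsstables stats tablename 001.2.340862 | join_sstables)
#   echo "run -b org.apache.cassandra.db:type=CompactionManager forceUserDefinedCompaction $files" \
#     | java -jar jmxterm.jar -l localhost:7199 -n
```

Only the joining step runs as written; the JMX call is left as a comment because it depends on the node and the JMX client at hand.
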
>
> -----------------
> Alexander Dejanovski
> France
> @alexanderdeja
>
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
>
> On Thu, Jun 20, 2019 at 9:17 AM Alexander Dejanovski <
> alex@thelastpickle.com> wrote:
>
>> My bad on date formatting, it should have been : %Y/%m/%d
>> Otherwise the SSTables aren't ordered properly.
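
For reference, a sketch of the listing command with the year-first format, so that `sort` orders the SSTables by maximum timestamp. It assumes GNU `date` and the `--gc_grace_seconds` spelling used elsewhere in the thread; the function names `fmt_ts` and `list_sstables` and the `-u` (UTC) flag are additions for illustration.

```shell
# Turn the leading 10 digits of a timestamp into a sortable date.
# -u (UTC) is an addition so the result does not depend on the local zone.
fmt_ts() {
  date -u --date=@"$(printf '%s' "$1" | cut -c 1-10)" '+%Y/%m/%d %H:%M:%S'
}

# Same listing as the original one-liner, with year-first dates.
# Run it from the table's data directory.
list_sstables() {
  for f in *Data.db; do
    [ -e "$f" ] || continue
    meta=$(sstablemetadata --gc_grace_seconds 259200 "$f")
    max=$(echo "$meta" | grep 'Maximum time' | cut -d' ' -f3)
    min=$(echo "$meta" | grep 'Minimum time' | cut -d' ' -f3)
    echo "$(fmt_ts "$max") $(fmt_ts "$min") $(echo "$meta" | grep droppable) $(ls -lh "$f")"
  done | sort
}
```

Calling `list_sstables` inside the table directory should give the same columns as the one-liner, only sorted chronologically.
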
>>
>> You have 2 SSTables that claim to cover timestamps from 1940 to 2262,
>> which is weird.
>> Aside from that, you have big overlaps all over the SSTables, so that's
>> probably why your tombstones are sticking around.
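
One plausible reading of the 1940 and 2262 dates (an inference, not something confirmed in the thread): the listing command keeps only the first 10 characters of each timestamp, and the first 10 characters of the smallest and largest signed 64-bit integers map to exactly those dates. That would suggest these two SSTables report the minimum and maximum long values as their timestamp bounds rather than real write times. The sketch below assumes GNU `date`; the variable names are illustrative.

```shell
# Largest and smallest signed 64-bit integers (Java's Long.MAX_VALUE / Long.MIN_VALUE).
long_max=9223372036854775807
long_min=-9223372036854775808

# Same truncation as the listing command: keep the first 10 characters.
max_secs=$(printf '%s' "$long_max" | cut -c 1-10)   # 9223372036
min_secs=$(printf '%s' "$long_min" | cut -c 1-10)   # -922337203 (the sign counts as a character)

# Same date format as the original listing, pinned to UTC.
max_date=$(date -u --date=@"$max_secs" '+%m/%d/%Y %H:%M:%S')
min_date=$(date -u --date=@"$min_secs" '+%m/%d/%Y %H:%M:%S')
echo "$max_date $min_date"
# prints: 04/11/2262 23:47:16 10/09/1940 19:13:17
```
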
>>
>> Your best shot here will be a major compaction of that table, since it
>> doesn't seem so big. Remember to use the --split-output flag on the
>> compaction command to avoid ending up with a single SSTable after that.
>>
>> Cheers,
>>
>> -----------------
>> Alexander Dejanovski
>> France
>> @alexanderdeja
>>
>> Consultant
>> Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>>
>> On Thu, Jun 20, 2019 at 8:13 AM Léo FERLIN SUTTON
>> <lferlin@mailjet.com.invalid> wrote:
>>
>>> On Thu, Jun 20, 2019 at 7:37 AM Alexander Dejanovski <
>>> alex@thelastpickle.com> wrote:
>>>
>>>> Hi Leo,
>>>>
>>>> The overlapping SSTables are indeed the most probable cause as
>>>> suggested by Jeff.
>>>> Do you know if the tombstone compactions actually triggered? (did the
>>>> SSTables name change?)
>>>>
>>>
>>> Hello !
>>>
>>> I believe they have changed. I do not remember the sstable name but the
>>> "last modified" has changed recently for these tables.
>>>
>>>
>>>> Could you run the following command to list SSTables and provide us the
>>>> output? It will display both their timestamp ranges along with the
>>>> estimated droppable tombstones ratio.
>>>>
>>>>
>>>> for f in *Data.db; do meta=$(sstablemetadata -gc_grace_seconds 259200
>>>> $f); echo $(date --date=@$(echo "$meta" | grep Maximum\ time | cut -d" "
>>>> -f3| cut -c 1-10) '+%m/%d/%Y %H:%M:%S') $(date --date=@$(echo "$meta" |
>>>> grep Minimum\ time | cut -d" "  -f3| cut -c 1-10) '+%m/%d/%Y
>>>> %H:%M:%S') $(echo "$meta" | grep droppable) $(ls -lh $f); done | sort
>>>>
>>>
>>> Here are the results:
>>>
>>> ```
>>> 04/01/2019 22:53:12 03/06/2018 16:46:13 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 16G Apr 13 14:35 md-147916-big-Data.db
>>> 04/11/2262 23:47:16 10/09/1940 19:13:17 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 218M Jun 20 05:57 md-167948-big-Data.db
>>> 04/11/2262 23:47:16 10/09/1940 19:13:17 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 2.2G Jun 20 05:57 md-167942-big-Data.db
>>> 05/01/2019 08:03:24 03/06/2018 16:46:13 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 4.6G May 1 08:39 md-152253-big-Data.db
>>> 05/09/2018 06:35:03 03/06/2018 16:46:07 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 30G Apr 13 22:09 md-147948-big-Data.db
>>> 05/21/2019 05:28:01 03/06/2018 16:46:16 Estimated droppable tombstones: 0.45150604672159905 -rw-r--r-- 1 cassandra cassandra 1.1G Jun 20 05:55 md-167943-big-Data.db
>>> 05/22/2019 11:54:33 03/06/2018 16:46:16 Estimated droppable tombstones: 0.30826566640798975 -rw-r--r-- 1 cassandra cassandra 7.6G Jun 20 04:35 md-167913-big-Data.db
>>> 06/13/2019 00:02:40 03/06/2018 16:46:08 Estimated droppable tombstones: 0.20980847354256815 -rw-r--r-- 1 cassandra cassandra 6.9G Jun 20 04:51 md-167917-big-Data.db
>>> 06/17/2019 05:56:12 06/16/2019 20:33:52 Estimated droppable tombstones: 0.6114260192855792 -rw-r--r-- 1 cassandra cassandra 257M Jun 20 05:29 md-167938-big-Data.db
>>> 06/18/2019 11:21:55 03/06/2018 17:48:22 Estimated droppable tombstones: 0.18655813086540254 -rw-r--r-- 1 cassandra cassandra 2.2G Jun 20 05:52 md-167940-big-Data.db
>>> 06/19/2019 16:53:04 06/18/2019 11:22:04 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 425M Jun 19 17:08 md-167782-big-Data.db
>>> 06/20/2019 04:17:22 06/19/2019 16:53:04 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 146M Jun 20 04:18 md-167921-big-Data.db
>>> 06/20/2019 05:50:23 06/20/2019 04:17:32 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 42M Jun 20 05:56 md-167946-big-Data.db
>>> 06/20/2019 05:56:03 06/20/2019 05:50:32 Estimated droppable tombstones: 0.0 -rw-r--r-- 2 cassandra cassandra 4.8M Jun 20 05:56 md-167947-big-Data.db
>>> 07/03/2018 17:26:54 03/06/2018 16:46:07 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 27G Apr 13 17:45 md-147919-big-Data.db
>>> 09/09/2018 18:55:23 03/06/2018 16:46:08 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 30G Apr 13 18:57 md-147926-big-Data.db
>>> 11/30/2018 11:52:33 03/06/2018 16:46:08 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 14G Apr 13 13:53 md-147908-big-Data.db
>>> 12/20/2018 07:30:03 03/06/2018 16:46:08 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 9.3G Apr 13 13:28 md-147906-big-Data.db
>>> ```
>>>
>>>> You could also check the min and max tokens in each SSTable (not sure if
>>>> you get that info from 3.0 sstablemetadata) so that you can detect the
>>>> SSTables that overlap on token ranges with the ones that carry the
>>>> tombstones, and have earlier timestamps. This way you'll be able to trigger
>>>> manual compactions, targeting those specific SSTables.
>>>>
>>>
>>> I have checked and I don't believe the info is available in the 3.0.X
>>> version of sstablemetadata :(
>>>
>>>
>>>> The rule for a tombstone to be purged is that there is no SSTable
>>>> outside the compaction that would possibly contain the partition and that
>>>> would have older timestamps.
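
The rule above can be sketched as a simple timestamp comparison. This is a deliberately simplified illustration, not Cassandra's actual code path: it ignores gc_grace_seconds and the way Cassandra decides whether an SSTable may contain the partition. The function name `can_purge` and its argument layout are made up for the example.

```shell
# can_purge TOMBSTONE_TS OTHER_MIN_TS...
# OTHER_MIN_TS are the minimum timestamps of the SSTables that are NOT part of
# the compaction but may contain the partition. The tombstone is purgeable
# only if every one of them is strictly newer than the tombstone.
can_purge() {
  tombstone_ts=$1
  shift
  for other_min_ts in "$@"; do
    if [ "$other_min_ts" -le "$tombstone_ts" ]; then
      return 1   # older data may live outside the compaction: keep the tombstone
    fi
  done
  return 0
}

# Hypothetical timestamps (microseconds since epoch), for illustration only.
if can_purge 1560000000000000 1561000000000000 1562000000000000; then
  echo "tombstone can be purged"
fi
```

With SSTables whose minimum timestamps all go back to the same early date, as in the listing above, a check like this would fail for most tombstones unless the overlapping SSTables are compacted together.
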
>>>>
>>> Is there a way to log these checks and decisions made by the compaction
>>> thread?
>>>
>>>
>>>> Is this a followup on your previous issue where you were trying to
>>>> perform a major compaction on an LCS table?
>>>>
>>>
>>> In some way.
>>>
>>> We are trying to globally reclaim the data used up by our tombstones (on
>>> more than one table). We have recently started to purge old data in our
>>> cassandra cluster, and since (on cloud providers) `Disk space isn't cheap`
>>> we are trying to be sure the data correctly expires and the disk space is
>>> reclaimed !
>>>
>>> The major compaction on the LCS table was one of our unsuccessful
>>> attempts (too long and too much disk space used, so abandoned), and we are
>>> currently trying to tweak the compaction parameters to speed things up.
>>>
>>> Regards.
>>>
>>> Leo
>>>
>>>> On Thu, Jun 20, 2019 at 7:02 AM Jeff Jirsa <jjirsa@gmail.com> wrote:
>>>>
>>>>> Probably overlapping sstables
>>>>>
>>>>> Which compaction strategy?
>>>>>
>>>>>
>>>>> > On Jun 19, 2019, at 9:51 PM, Léo FERLIN SUTTON <lferlin@mailjet.com.invalid> wrote:
>>>>> >
>>>>> > I have used the following command to check if I had droppable tombstones:
>>>>> > `/usr/bin/sstablemetadata --gc_grace_seconds 259200 /var/lib/cassandra/data/stats/tablename/md-sstablename-big-Data.db`
>>>>> >
>>>>> > I checked every sstable in a loop and had 4 sstables with droppable tombstones:
>>>>> >
>>>>> > ```
>>>>> > Estimated droppable tombstones: 0.1558453651124074
>>>>> > Estimated droppable tombstones: 0.20980847354256815
>>>>> > Estimated droppable tombstones: 0.30826566640798975
>>>>> > Estimated droppable tombstones: 0.45150604672159905
>>>>> > ```
>>>>> >
>>>>> > I changed my compaction configuration this morning (via JMX) to force a
>>>>> > tombstone compaction. These are my settings on this node:
>>>>> >
>>>>> > ```
>>>>> > {
>>>>> > "max_threshold":"32",
>>>>> > "min_threshold":"4",
>>>>> > "unchecked_tombstone_compaction":"true",
>>>>> > "tombstone_threshold":"0.1",
>>>>> > "class":"org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy"
>>>>> > }
>>>>> > ```
>>>>> > The threshold is lower than the ratio of droppable tombstones in these
>>>>> > sstables, and I expected the setting `unchecked_tombstone_compaction=True`
>>>>> > to force cassandra to run a "Tombstone Compaction", yet about 24h later
>>>>> > all the tombstones are still there.
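
As a small sketch of the comparison described here, the helper below keeps only the "Estimated droppable tombstones" lines whose ratio is above a given threshold. It assumes the line format shown in the `sstablemetadata` output above; the function name `above_threshold` is made up for the example.

```shell
# Print the "Estimated droppable tombstones" lines read from stdin whose
# ratio (last field) is strictly greater than the threshold passed as $1.
above_threshold() {
  awk -v t="$1" '/Estimated droppable tombstones:/ { if ($NF + 0 > t + 0) print }'
}

# Two of the ratios reported in the thread, checked against tombstone_threshold 0.1.
printf '%s\n' \
  'Estimated droppable tombstones: 0.1558453651124074' \
  'Estimated droppable tombstones: 0.0' | above_threshold 0.1
```

Passing this check only makes an SSTable a candidate for a tombstone compaction; whether the tombstones are actually dropped still depends on the overlap with other SSTables.
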
>>>>> >
>>>>> > ## About the cluster :
>>>>> >
>>>>> > The compaction backlog is clear and here are our cassandra settings:
>>>>> >
>>>>> > Cassandra 3.0.18
>>>>> > concurrent_compactors: 4
>>>>> > compaction_throughput_mb_per_sec: 150
>>>>> > sstable_preemptive_open_interval_in_mb: 50
>>>>> > memtable_flush_writers: 4
>>>>> >
>>>>> >
>>>>> > Any idea what I might be missing ?
>>>>> >
>>>>> > Regards,
>>>>> >
>>>>> > Leo
>>>>>
