cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alain RODRIGUEZ <arodr...@gmail.com>
Subject Re: TTL tombstones in Cassandra using LCS are cretaed in the same level data TTLed data?
Date Thu, 27 Sep 2018 17:11:05 GMT
Hello Gabriel,

Another clue to explore would be to use the TTL as a default value if
> that's a good fit. TTLs set at the table level with 'default_time_to_live'
> should not generate any tombstone at all in C*3.0+. Not tested on my hand,
> but I read about this.
>

As explained on a parallel thread, this is wrong ^, mea culpa. I believe
the rest of my comment still stands (hopefully :)).

I'm not sure what it means with "*in-place*" since SSTables are immutable.
> [...]

 My guess is that is referring to tombstones being created in the same
> level (but different SStables) that the TTLed data during a compaction
> triggered


Yes, I believe during the next compaction following the expiration date,
the entry is 'transformed' into a tombstone, and lives in the SSTable that
is the result of the compaction, on the level/bucket this SSTable is put
into. That's why I said 'in-place' which is indeed a bit weird for
immutable data.

As a side idea for your problem, on 'modern' versions of Cassandra (I don't
remember the version, that's what 'modern' means ;-)), you can run
'nodetool garbagecollect' regularly (not necessarily frequently) during the
off-peak period. That might use the cluster resources when you don't need
them to claim some disk space. Also making sure that a 2 years old record
is not being updated regularly by design would definitely help. In the
extreme case of writing a data once (never updated) and with a TTL for
example, I see no reason for a 2 years old data not to be evicted
correctly. As long as the disk can grow, it should be fine.

I would not be too much scared about it, as there is 'always' a way to
remove tombstones. Yet it's good to think about the design beforehand
indeed, generally, it's good if you can rotate the partitions over time,
not to reuse old partitions for example.

C*heers,
-----------------------
Alain Rodriguez - @arodream - alain@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

Le mar. 25 sept. 2018 à 17:38, Gabriel Giussi <gabrielgiussi@gmail.com> a
écrit :

> I'm using LCS and a relatively large TTL of 2 years for all inserted rows
> and I'm concerned about the moment at wich C* would drop the corresponding
> tombstones (neither explicit deletes nor updates are being performed).
>
> From [Missing Manual for Leveled Compaction Strategy](
> https://www.youtube.com/watch?v=-5sNVvL8RwI), [Tombstone Compactions in
> Cassandra](https://www.youtube.com/watch?v=pher-9jqqC4) and [Deletes
> Without Tombstones or TTLs](https://www.youtube.com/watch?v=BhGkSnBZgJA)
> I understand that
>
>  - All levels except L0 contain non-overlapping SSTables, but a partition
> key may be present in one SSTable in each level (aka distributed in all
> levels).
>  - For a compaction to be able to drop a tombstone it must be sure that is
> compacting all SStables that contains de data to prevent zombie data (this
> is done checking bloom filters). It also considers gc_grace_seconds
>
> So, for my particular use case (2 years TTL and write heavy load) I can
> conclude that TTLed data will be in highest levels so I'm wondering when
> those SSTables with TTLed data will be compacted with the SSTables that
> contains the corresponding SSTables.
> The main question will be: **Where are tombstones (from ttls) being
> created? Are being created at Level 0 so it will take a long time until it
> will end up in the highest levels (hence disk space will take long time to
> be freed)?**
>
> In a comment from [About deletes and tombstones](
> http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html)
> Alain says that
> > Yet using TTLs helps, it reduces the chances of having data being
> fragmented between SSTables that will not be compacted together any time
> soon. Using any compaction strategy, if the delete comes relatively late in
> the row history, as it use to happen, the 'upsert'/'insert' of the
> tombstone will go to a new SSTable. It might take time for this tombstone
> to get to the right compaction "bucket" (with the rest of the row) and for
> Cassandra to be able to finally free space.
> **My understanding is that with TTLs the tombstones is created in-place**,
> thus it is often and for many reasons easier and safer to get rid of a TTLs
> than from a delete.
> Another clue to explore would be to use the TTL as a default value if
> that's a good fit. TTLs set at the table level with 'default_time_to_live'
> should not generate any tombstone at all in C*3.0+. Not tested on my hand,
> but I read about this.
>
> I'm not sure what it means with "*in-place*" since SSTables are
> immutable.
> (I also have some doubts about what it says of using
> `default_time_to_live` that I've asked in [How default_time_to_live would
> delete rows without tombstones in Cassandra?](
> https://stackoverflow.com/q/52282517/3517383)).
>
> My guess is that is referring to tombstones being created in the same
> level (but different SStables) that the TTLed data during a compaction
> triggered by one of the following reasons:
>
>  1. "Going from highest level, any level having score higher than 1.001
> can be picked by a compaction thread" [The Missing Manual for Leveled
> Compaction Strategy](
> https://image.slidesharecdn.com/csummit16lcstalk-161004232416/95/the-missing-manual-for-leveled-compaction-strategy-wei-deng-ryan-svihla-datastax-cassandra-summit-2016-12-638.jpg?cb=1475693117
> )
>  2. "If we go 25 rounds without compacting in the highest level, we start
> bringing in sstables from that level into lower level compactions" [The
> Missing Manual for Leveled Compaction Strategy](
> https://image.slidesharecdn.com/csummit16lcstalk-161004232416/95/the-missing-manual-for-leveled-compaction-strategy-wei-deng-ryan-svihla-datastax-cassandra-summit-2016-12-638.jpg?cb=1475693117
> )
>  3. "When there are no other compactions to do, we trigger a
> single-sstable compaction if there is more than X% droppable tombstones in
> the sstable." [CASSANDRA-7019](
> https://issues.apache.org/jira/browse/CASSANDRA-7019)
> Since tombstones are created during compaction, I think it may be using
> SSTable metadata to estimate droppable tombstones.
>
> **So, compactions (2) and (3) should be creating/dropping tombstones in
> highest levels hence using LCS with a large TTL should not be an issue per
> se.**
> With creating/dropping I mean that the same kind of compactions will be
> creating tombstones for expired data and/or dropping tombstones if the gc
> period has already passed.
>
> A link to source code that clarifies this situation will be great, thanks.
>

Mime
View raw message