cassandra-user mailing list archives

From Jeff Jirsa <jji...@gmail.com>
Subject Re: Questions on time series use case, tombstones, TWCS
Date Wed, 09 Aug 2017 14:27:37 GMT
The deleting compaction strategy from protectwise (https://github.com/protectwise/cassandra-util/blob/master/deleting-compaction-strategy/README.md)
was written (I believe) to solve a similar problem - business based deletion rules to enable
flexible TTLs. May want to glance at that.

Other answers inline below 


-- 
Jeff Jirsa


> On Aug 9, 2017, at 1:41 AM, Steinmaurer, Thomas <thomas.steinmaurer@dynatrace.com>
wrote:
> 
> Hello,
>  
> our top contributor from a data volume perspective is time series data. We have been running
STCS since our initial production deployment in 2014, across several clusters with varying node
counts, currently with at most 9 nodes per cluster per AWS region, on m4.xlarge instances with
EBS gp2 storage. Our Cassandra versions have ranged from 1.2 up to DSC 2.1.15, which will soon be
replaced by Apache Cassandra 2.1.18 across all deployments. Lately we switched from Thrift
(Astyanax) to native CQL (DataStax driver). Overall we are pretty happy with the stability and
the scale-out capabilities.
>  
> We store time series data at different resolutions, from 1min up to 1day aggregates per
“time slot”. Each resolution has its own column family / table, and a periodic worker executes
our business logic for time series aging, e.g. rolling up 1min => 5min => … resolutions and
deleting from the source resolution according to our per-resolution retention policy. Deletions
therefore happen much later (e.g. at least > 14d). We don’t use TTLs on written
time series data (in production; see TWCS testing below), so purging is handled exclusively
by explicit DELETEs in our aging business logic, which create tombstones.
>  
> Naturally, with STCS and late explicit deletions / tombstones it takes a long time
to finally reclaim disk space; even worse, we are now running a major compaction every X weeks.
We are also currently testing with STCS min_threshold = 2 etc., but all in all this does not feel
like a long-term solution. I guess there is nothing else we are missing from a configuration/settings
side with STCS? Single-SSTable compaction might not kick in either, because when checking with sstablemetadata,
the estimated droppable tombstones value for our time series SSTables is pretty much 0.0 all the
time. I guess this is because we don’t write with TTL?


Or because you aren't issuing deletes; explicit deletes past GCGS will also cause that number to increase.
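
If you want to double-check, sstablemetadata (tools/bin in 2.1) reports that estimate per
SSTable - the path below is just a placeholder for one of your time series SSTables:

    $ tools/bin/sstablemetadata /var/lib/cassandra/data/ks/ts_1min-*/ks-ts_1min-ka-1234-Data.db | grep -i droppable
    Estimated droppable tombstones: 0.0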

>  
> TWCS caught my eye in 2015 I think, and even more at the Cassandra Summit 2016 + other
tombstone-related talks. Cassandra 3.0 is around 6 months away for us, so initial testing
was with 2.1.18 patched with TWCS from GitHub.
>  
> Looks like TWCS is exactly what we need: in our tests, once we start writing with TTL
we end up with a single SSTable per elapsed window, and data (SSTables) older than TTL
+ grace is automatically removed from disk. Even with out-of-order DELETEs from
our business logic enabled, purging of SSTables does not seem to get stuck; not sure if this is
expected. Writing with TTL is also a bit problematic in case our retention policy changes in
general or for particular customers.

Search for my Cassandra Summit talk from 2016 - there are a few other compaction options you
probably want to set to more aggressively trigger single-SSTable compaction to help unstick
it.
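
As a rough sketch (table name and values below are placeholders, adjust for your schema; with
the 2.1 backport the class name is the full package name of the patched strategy rather than
the short name shown here):

    ALTER TABLE ks.ts_1min WITH compaction = {
        'class': 'TimeWindowCompactionStrategy',
        'compaction_window_unit': 'DAYS',
        'compaction_window_size': '1',
        'unchecked_tombstone_compaction': 'true',
        'tombstone_threshold': '0.05',
        'tombstone_compaction_interval': '86400'
    };

unchecked_tombstone_compaction is the main one for unsticking - it lets a single-SSTable
tombstone compaction run even when the SSTable overlaps others.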

>  
> A few questions, as we need some short-term (with C* 2.1) and long-term (with C* 3.0)
mitigation:
> ·         With STCS, estimated droppable tombstones always being 0.0 (and thus also no automatic
single-SSTable compaction happening): Is this a matter of not writing with TTL? If yes, would
enabling TTL with STCS improve the disk reclaim situation, because single-SSTable compactions
would then kick in?
> ·         What is the semantic of “default_time_to_live” at table level? From http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_expire_c.html
: “After the default_time_to_live TTL value has been exceeded, Cassandra tombstones the entire
table”. What does “entire table” mean?

It probably means sstable, but even that isn't really accurate - that's a doc bug 
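
As far as I know, default_time_to_live just means any write to that table that doesn't specify
its own TTL gets the table default applied; a per-query TTL still overrides it. For example
(table and column names below are placeholders, 1209600 = 14 days):

    ALTER TABLE ks.ts_1min WITH default_time_to_live = 1209600;
    -- an explicit TTL on the write still wins over the table default:
    INSERT INTO ks.ts_1min (id, ts, value) VALUES ('a', '2017-08-09 14:00:00', 42) USING TTL 86400;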

> Hopefully / I guess I won’t end up with an empty table every time X TTLs have passed?
> ·         Anything else I’m missing regarding STCS and reclaiming disk space earlier
in our TS use case?

LCS rewrites much more aggressively on partition updates - if you can spare the IO, it's likely
going to be more efficient at purging deleted data than STCS.
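
If you want to try it, it's just an ALTER TABLE (sstable_size_in_mb below is only an example):

    ALTER TABLE ks.ts_1min
    WITH compaction = {'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': '160'};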

> ·         I know changing compaction is a matter of executing ALTER TABLE (or temporarily
via JMX for a single node), but as we have legacy data written without TTL, I wonder
if we may end up with stuck SSTables again
> ·         In case of stuck SSTables with any compaction strategy, what is the best way
to debug/analyze why they got stuck (overlapping etc.)?

sstableexpiredblockers
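
It lists which newer SSTables are blocking fully expired SSTables from being dropped, e.g.
(keyspace/table names are placeholders):

    $ tools/bin/sstableexpiredblockers my_keyspace ts_1min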

>  
> Thanks a lot and sorry for the lengthy email.
>  
> Thomas
