cassandra-user mailing list archives

From "Steinmaurer, Thomas" <thomas.steinmau...@dynatrace.com>
Subject RE: Drop TTLd rows: upgradesstables -a or scrub?
Date Tue, 11 Sep 2018 09:07:34 GMT
Alex,

a single (largish) SSTable, or any other SSTable of a table that no longer receives writes
(e.g. deletes), will most likely never be picked up by an automatic minor compaction again,
so it may stay on disk forever, unless I'm missing something crucial here. It might be
different if you write exclusively with TTLs, because automatic single-SSTable tombstone
compaction may kick in then, but I don't have much experience with that.

We struggled a lot with storing time series data under STCS: we needed enough disk capacity
to keep the cluster working smoothly while automatic minor compactions purged aged time series
data according to the retention policies in our business logic. TWCS is unfortunately not an
option for us. So we ran major compactions every X weeks to reclaim disk space, which is far
from nice from an operational perspective. We finally decided to change the STCS min_threshold
from the default of 4 to 2, so that minor compactions kick in more frequently. We can live with
the additional IO/CPU this causes, so this is our current approach to the disk space and sizing
issues we had in the past.
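
To illustrate why a lower min_threshold makes minor compactions more frequent, here is a
rough sketch of size-tiered bucketing. It is a simplified model, not Cassandra's actual
implementation: it ignores min_sstable_size and max_threshold, the SSTable sizes are made
up, and the function names are hypothetical. The 0.5/1.5 bounds are meant to mirror the
bucket_low/bucket_high defaults.

```python
# Simplified model of STCS bucketing; NOT Cassandra's real code.
# Assumption: an SSTable joins a bucket if its size lies within
# [avg * bucket_low, avg * bucket_high] of that bucket's average size.

def bucket_sstables(sizes_mb, bucket_low=0.5, bucket_high=1.5):
    """Group SSTable sizes (in MB) into buckets of similar size."""
    buckets = []  # each bucket is a list of sizes
    for size in sorted(sizes_mb):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if avg * bucket_low <= size <= avg * bucket_high:
                bucket.append(size)
                break
        else:
            # No existing bucket fits, so start a new one.
            buckets.append([size])
    return buckets


def compaction_candidates(sizes_mb, min_threshold=4):
    """Return the buckets holding at least min_threshold SSTables."""
    return [b for b in bucket_sstables(sizes_mb) if len(b) >= min_threshold]


# Hypothetical SSTable sizes in MB for a single table.
sizes = [10, 12, 100, 110, 120, 5000]
print(compaction_candidates(sizes, min_threshold=4))
print(compaction_candidates(sizes, min_threshold=2))
```

With these made-up sizes no bucket reaches four members, so nothing qualifies at the
default, while two buckets qualify at min_threshold 2. The setting itself is changed in
CQL, roughly like this (ks.tbl is a placeholder):
ALTER TABLE ks.tbl WITH compaction = {'class': 'SizeTieredCompactionStrategy', 'min_threshold': '2'};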

Thomas

From: Oleksandr Shulgin <oleksandr.shulgin@zalando.de>
Sent: Tuesday, 11 September 2018 09:47
To: User <user@cassandra.apache.org>
Subject: Re: Drop TTLd rows: upgradesstables -a or scrub?

On Tue, Sep 11, 2018 at 9:31 AM Steinmaurer, Thomas <thomas.steinmaurer@dynatrace.com>
wrote:
As far as I remember, in newer Cassandra versions nodetool compact offers a ‘-s’ command-line
option for STCS that splits the output into files of 50%, 25%, … of the total size, so in that
case you no longer end up with a single largish SSTable. By default, without -s, the result is
a single SSTable, though.
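
As a rough sketch of what such a halving split produces, the snippet below computes output
file sizes for a given total. The function name is hypothetical, and the 50 MB floor at
which splitting stops is an assumption made for illustration, not something stated in
this thread.

```python
# Sketch of a 50% / 25% / 12.5% ... split of a major compaction's output.
# Assumption: splitting stops once the next file would fall below min_size_mb.

def split_output_sizes(total_mb, min_size_mb=50):
    """Return the output file sizes (in MB) of a halving split."""
    sizes = []
    remaining = total_mb
    while remaining / 2 >= min_size_mb:
        half = remaining / 2
        sizes.append(half)   # next file takes half of what is left
        remaining -= half
    sizes.append(remaining)  # the last file holds the remainder
    return sizes


# Hypothetical 800 MB of compacted data.
print(split_output_sizes(800))
```

The corresponding invocation would be along the lines of nodetool compact -s <keyspace> <table>.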

Thanks Thomas, I've also spotted the option while testing this approach.  I understand that
doing major compactions is generally not recommended, but do you see any real drawback of
having a single SSTable file in case we stopped writing new data to the table?

--
Alex
