cassandra-commits mailing list archives

From Björn Hegerfors (JIRA) <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-8360) In DTCS, always compact SSTables in the same time window, even if they are fewer than min_threshold
Date Fri, 21 Nov 2014 17:20:34 GMT
Björn Hegerfors created CASSANDRA-8360:
------------------------------------------

             Summary: In DTCS, always compact SSTables in the same time window, even if they
are fewer than min_threshold
                 Key: CASSANDRA-8360
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8360
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Björn Hegerfors
            Priority: Minor


DTCS uses min_threshold to decide how many time windows of the same size need to accumulate
before they are merged into a larger window. The age of an SSTable is defined by its min timestamp,
and it always falls into exactly one of the time windows. If multiple SSTables fall into the
same window, DTCS considers compacting them, but if there are fewer than min_threshold of them,
it decides not to.
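
To make the current behavior concrete, here is a minimal Python sketch. It is a simplified model,
not the DTCS code: it assumes fixed-size windows (real DTCS windows grow with age), and the
names group_by_window, compactable_windows and window_size are made up for illustration.

```python
from collections import defaultdict

def group_by_window(min_timestamps, window_size):
    """Map each window index to the SSTables whose min timestamp falls in it.

    Assumption: windows are fixed-size, so the window index is ts // window_size.
    """
    windows = defaultdict(list)
    for name, ts in min_timestamps.items():
        windows[ts // window_size].append(name)
    return dict(windows)

def compactable_windows(min_timestamps, window_size, min_threshold):
    """Keep only the windows holding at least min_threshold SSTables."""
    windows = group_by_window(min_timestamps, window_size)
    return {w: names for w, names in windows.items() if len(names) >= min_threshold}

# Hypothetical SSTables, keyed by name, with their min timestamps.
sstables = {"a": 0, "b": 10, "c": 20, "d": 30, "e": 100, "f": 110}
result = compactable_windows(sstables, window_size=60, min_threshold=4)
```

In this model, "e" and "f" share a window but are left alone because there are only two of them,
which is the situation this ticket is about.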

When do more than 1 but fewer than min_threshold SSTables end up in the same time window (except
for the current window), you might ask? In the current state, DTCS can spill some extra SSTables
into bigger windows when the previous window wasn't fully compacted, which happens all the
time when the latest window stops being the current one. Also, repairs and hints can put new
SSTables in old windows.

I think, and [~jjordan] agreed in a comment on CASSANDRA-6602, that DTCS should ignore min_threshold
and compact SSTables in the same window regardless of how few there are. I guess max_threshold
should still be respected.
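
A rough Python sketch of the proposed rule, under stated assumptions: the window contents are
given as a dict from window index to min timestamps, any window with two or more SSTables
qualifies, and when a window holds more than max_threshold SSTables the oldest ones are picked
first. That last choice is only an assumption for the example; the function name is hypothetical.

```python
def candidates_ignoring_min_threshold(windows, max_threshold):
    """Select SSTables to compact per window, ignoring min_threshold.

    windows: dict of window index -> list of SSTable min timestamps.
    A window qualifies as soon as it holds 2 or more SSTables.
    At most max_threshold SSTables are picked per window (oldest first,
    which is an assumption of this sketch, not something the ticket specifies).
    """
    selected = {}
    for window, timestamps in windows.items():
        if len(timestamps) >= 2:
            selected[window] = sorted(timestamps)[:max_threshold]
    return selected

# Hypothetical window contents: window 1 has a single SSTable and is skipped.
windows = {0: [5, 1, 3], 1: [70], 2: [130, 125]}
picked = candidates_ignoring_min_threshold(windows, max_threshold=2)
```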

[~jjordan] suggested that this should apply to all windows but the current window, where all
the new SSTables end up. That could make sense. I'm not clear on whether compacting many SSTables
at once is more cost-efficient or not when it comes to the very newest and smallest SSTables.
Maybe compacting as soon as 2 SSTables are seen is fine if the initial window size is small
enough? The opposite could be the case too: perhaps the very newest SSTables should be
compacted many at a time?
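
The variant where the current window is exempt could be sketched like this in Python. The
function and parameter names are hypothetical, and the rule for the current window (keep
requiring min_threshold) is one possible reading of the suggestion, not a settled design.

```python
def should_compact(window, current_window, sstable_count, min_threshold):
    """Decide whether the SSTables in a window should be compacted.

    Assumption: the current window still waits for min_threshold SSTables,
    while every older window compacts as soon as it holds 2 or more.
    """
    required = min_threshold if window == current_window else 2
    return sstable_count >= required

# With min_threshold=4, two SSTables are enough in an old window
# but not in the current one.
old_window_ok = should_compact(window=7, current_window=9, sstable_count=2, min_threshold=4)
current_window_ok = should_compact(window=9, current_window=9, sstable_count=2, min_threshold=4)
```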



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
