cassandra-commits mailing list archives

From Björn Hegerfors (JIRA) <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-9013) Add new option making DTCS unify larger time windows sooner
Date Fri, 20 Mar 2015 11:01:38 GMT
Björn Hegerfors created CASSANDRA-9013:
------------------------------------------

             Summary: Add new option making DTCS unify larger time windows sooner
                 Key: CASSANDRA-9013
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9013
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Björn Hegerfors
            Priority: Minor


In my very long post on CASSANDRA-6602, I mentioned a more aggressive windowing strategy,
which looks for opportunities to compact into larger SSTables sooner. Under the original
approach, when we have min_threshold windows of the same size and another window of that size
appears next to them, those windows (not including the newest addition) merge. The new approach
doesn't wait for a (min_threshold+1)th window. As soon as min_threshold windows of one size
exist, they merge at once. The only exception is the "incoming window", which is excluded
from merging with other windows until it is no longer the incoming window.
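
The difference between the two triggers can be sketched as below. This is only an
illustration of the rule described above, not code from DTCS itself; the function name
`windows_to_merge` and the `aggressive` flag are hypothetical, and the count passed in is
assumed to exclude the incoming window.

```python
def windows_to_merge(same_size_count, min_threshold=4, aggressive=False):
    """Return how many equal-sized, non-incoming windows merge right now (0 if none).

    Hypothetical helper: `aggressive` stands in for the proposed option,
    whose real name is not given in this ticket.
    """
    # Original rule: wait until a (min_threshold+1)th window of this size appears.
    # Proposed rule: merge as soon as min_threshold windows of this size exist.
    trigger = min_threshold if aggressive else min_threshold + 1
    # In both cases exactly min_threshold windows are merged; under the
    # original rule the newest addition is left out of the merge.
    return min_threshold if same_size_count >= trigger else 0
```

With min_threshold at 4, four same-sized windows merge immediately under the proposed rule,
while the original rule leaves them alone until a fifth one shows up.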

This does mean that occasionally more than min_threshold SSTables, not all of similar size,
get compacted at once, and this is intentional. For example, say min_threshold is 4. If we have
3 windows of size 16, 3 windows of size 4, and just get a 4th size 1 window that isn't the
incoming window, then the four size 1 windows form a 4th size 4 window, which in turn completes
a 4th size 16 window, so we immediately merge all of those into a single size 64 window.
Typically we expect one SSTable in each window, with a file size corresponding to the window
size in some unit of measure. So we merge roughly 10 SSTables (3 + 3 + 4) in that scenario.
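
The cascade in that example can be checked with a small simulation. This is a simplified
sketch, not the DTCS implementation: it tracks only window sizes, ignores window adjacency
and time alignment, and assumes the incoming window has already been left out of the input.
The function name `aggressive_merge` is made up for illustration.

```python
from collections import Counter


def aggressive_merge(window_sizes, min_threshold=4):
    """Repeatedly merge min_threshold equal-sized windows into one larger window.

    window_sizes: sizes of the non-incoming windows, in arbitrary units.
    Returns the resulting window sizes, largest first.
    """
    counts = Counter(window_sizes)
    changed = True
    while changed:
        changed = False
        # Start from the smallest size so merges can cascade upwards.
        for size in sorted(counts):
            if counts[size] >= min_threshold:
                counts[size] -= min_threshold
                counts[size * min_threshold] += 1
                changed = True
                break
    return sorted(counts.elements(), reverse=True)


# The scenario from the text: 3 windows of size 16, 3 of size 4, 4 of size 1.
before = [16, 16, 16, 4, 4, 4, 1, 1, 1, 1]
after = aggressive_merge(before, min_threshold=4)
print(len(before), after)  # 10 windows collapse into [64]
```

With only three size 1 windows, nothing reaches the threshold and the list comes back
unchanged, which is why the big compaction is triggered by the arrival of the 4th one.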

These bigger compactions happen rarely, about as often as a similar thing happens in STCS
(on occasion the number of SSTables gets very small). This tweak to DTCS is meant to mimic
that behavior in STCS. It has been observed that DTCS typically has 50% to 100% more SSTables
than STCS, so this is a way to counter that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
