cassandra-commits mailing list archives

From "Jeremy Hanna (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-9013) Add new option making DTCS unify larger time windows sooner
Date Thu, 26 Nov 2015 15:37:11 GMT


Jeremy Hanna updated CASSANDRA-9013:
    Labels: dtcs  (was: )

> Add new option making DTCS unify larger time windows sooner
> -----------------------------------------------------------
>                 Key: CASSANDRA-9013
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Björn Hegerfors
>            Assignee: Björn Hegerfors
>            Priority: Minor
>              Labels: dtcs
>         Attachments: cassandra-2.0-CASSANDRA-9013.txt
> In my very long post on CASSANDRA-6602, I mentioned a more aggressive windowing strategy,
> which looks for opportunities to compact into larger SSTables sooner. In the original approach,
> when we have min_threshold windows of the same size and another one of the same size
> appears next to them, those windows (not including the newest addition) merge. The new approach
> doesn't wait for a (min_threshold+1)th one. As soon as min_threshold windows of one size
> exist, they merge at once. The only exception is the "incoming window", which is excluded
> from merging with other windows until it is no longer the incoming window.
> This does mean that occasionally more than min_threshold SSTables, not all of similar
> size, get compacted together, intentionally. For example, say min_threshold is 4. If we have
> 3 windows of size 16, 3 windows of size 4, and a 4th size 1 window has just appeared that isn't
> the incoming window, we immediately merge all of those into a size 64 window
> (3*16 + 3*4 + 4*1 = 64). Typically we expect one SSTable to be in each window, with a file size
> corresponding to the window size in some unit of measure.
> So we merge roughly 10 SSTables in that scenario.
> These bigger compactions happen rarely, about as often as a similar thing happens in
> STCS (on occasion the number of SSTables gets very small). This tweak to DTCS is meant to
> mimic that behavior in STCS. It has been observed that DTCS typically has 50% to 100% more
> SSTables than STCS, so this is a way to counter that.
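
The difference between the two merge rules in the description can be illustrated with a small model. This is only a sketch, not the code from the attached patch: the function name `unify_windows` and the `eager` flag are hypothetical, windows are reduced to bare sizes, and adjacency, time alignment and the incoming window are left out (the input is assumed to exclude the incoming window).

```python
from collections import Counter


def unify_windows(window_sizes, min_threshold, eager=True):
    """Return the window sizes left after all possible merges (largest first).

    window_sizes: sizes of existing windows, NOT counting the incoming window.
    eager=True:  merge as soon as min_threshold windows share a size
                 (the behaviour proposed in this ticket).
    eager=False: wait until a (min_threshold + 1)th window of that size
                 exists, then merge the other min_threshold (original rule).
    """
    counts = Counter(window_sizes)
    needed = min_threshold if eager else min_threshold + 1

    merged = True
    while merged:
        merged = False
        # Start from the smallest size so that one merge can trigger the next.
        for size in sorted(counts):
            if counts[size] >= needed:
                counts[size] -= min_threshold
                counts[size * min_threshold] += 1
                merged = True
                break

    # elements() skips sizes whose count has dropped to zero.
    return sorted(counts.elements(), reverse=True)


# The example from the description: min_threshold = 4, three size 16 windows,
# three size 4 windows, and a fourth size 1 window has just appeared.
windows = [16, 16, 16, 4, 4, 4, 1, 1, 1, 1]
print(unify_windows(windows, 4, eager=True))   # all 10 windows collapse into one
print(unify_windows(windows, 4, eager=False))  # nothing merges yet
```

With the eager rule the four size 1 windows merge into a size 4 window, which completes a group of four size 4 windows, which in turn completes a group of four size 16 windows, so everything ends up in a single size 64 window. Under the original rule the same input is left untouched until a fifth size 1 window shows up.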

This message was sent by Atlassian JIRA
