Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cassandra.apache.org
Date: Fri, 11 Oct 2013 23:16:42 +0000 (UTC)
From: "Tyler Hobbs (JIRA)" <jira@apache.org>
To: commits@cassandra.apache.org
Message-ID: <JIRA.12671094.1380382399510.54267.1381533402597@arcas>
In-Reply-To: <JIRA.12671094.1380382399510@arcas>
References: <JIRA.12671094.1380382399510@arcas>
Subject: [jira] [Commented] (CASSANDRA-6109) Consider coldness in STCS
 compaction
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/CASSANDRA-6109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13793127#comment-13793127 ] 

Tyler Hobbs commented on CASSANDRA-6109:
----------------------------------------

I've spent some more time thinking about this and it seems like we either need a more sophisticated approach in order to handle the various corner cases or we need to disable this feature by default.

If we disable the feature by default, then using a hotness percentile or something similar might be okay.

If we want to enable the feature by default, I've got a couple of more sophisticated approaches:

The first approach is fairly simple and uses two parameters:
* SSTables which receive less than X% of the reads/sec per key of the hottest sstable (for the whole CF) will be considered cold.
* If the cold sstables make up more than Y% of the total reads/sec, stop treating the warmest of them as cold. (In other words, go through the "cold" bucket and remove the warmest sstables until the cold bucket makes up less than Y% of the total reads/sec.)

This solves one problem of basing coldness on the mean rate, which is that if you have almost all cold sstables, the mean will be very low.  Comparing against the max deals well with this.  The second parameter acts as a hedge for the case you brought up where a large number of cold sstables can collectively account for a high percentage of the total reads.
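
As a rough illustration of the two-parameter rule above (a sketch under stated assumptions, not the code in the attached patches): the SSTable class, the find_cold_sstables function, and the x_pct / y_pct names are hypothetical, and hotness is assumed to be reads/sec divided by the number of keys.

```python
from dataclasses import dataclass


@dataclass
class SSTable:
    # Hypothetical stand-in for an sstable's read statistics.
    name: str
    reads_per_sec: float
    keys: int

    @property
    def hotness(self) -> float:
        # Assumed definition: reads/sec per key.
        return self.reads_per_sec / self.keys if self.keys else 0.0


def find_cold_sstables(sstables, x_pct, y_pct):
    """Return the sstables considered cold, coldest first."""
    if not sstables:
        return []
    max_hotness = max(s.hotness for s in sstables)
    total_reads = sum(s.reads_per_sec for s in sstables)

    # Parameter X: cold means below X% of the hottest sstable's hotness.
    threshold = max_hotness * x_pct / 100.0
    cold = sorted((s for s in sstables if s.hotness < threshold),
                  key=lambda s: s.hotness)

    # Parameter Y: drop the warmest "cold" sstables until the cold set
    # accounts for no more than Y% of the total reads/sec.
    limit = total_reads * y_pct / 100.0
    cold_reads = sum(s.reads_per_sec for s in cold)
    while cold and cold_reads > limit:
        warmest = cold.pop()
        cold_reads -= warmest.reads_per_sec
    return cold


if __name__ == "__main__":
    tables = [SSTable("a", 1000, 100), SSTable("b", 50, 100),
              SSTable("c", 20, 100), SSTable("d", 10, 100)]
    print([s.name for s in find_cold_sstables(tables, x_pct=10, y_pct=5)])
```

In this made-up example, "b", "c" and "d" all fall below 10% of the hottest sstable, but together they serve more than 5% of the reads, so "b" is moved back out of the cold set.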

The second approach is less hacky but more difficult to explain or tune; it's a bucket optimization measure that covers these concerns.  Ideally, we would optimize two things:
* Average sstable hotness of the bucket
* The percentage of the total CF reads that are included in the bucket

These two items are somewhat in opposition.  Optimizing only for the first measure would mean just compacting the two hottest sstables.  Optimizing only for the second would mean compacting all sstables.  We can combine the two measures with different weightings to get a pretty good bucket optimization measure.  I've played around with some different measures in Python and have a script that makes approximately the same bucket choices I would.  However, as I mentioned, this would be pretty hard for operators to understand and tune intelligently, somewhat like phi_convict_threshold.  If you're still open to that, I can attach my script with some example runs.
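
One possible shape for such a combined measure, purely as an illustration (this is not the script mentioned above, and the actual weighting is not given in this comment): the choose_bucket function, the hotness_weight parameter, and the linear combination are all assumptions. Candidate buckets are limited to the N hottest sstables for each N >= 2.

```python
def choose_bucket(sstables, hotness_weight=0.5, min_size=2):
    """Pick a bucket by a weighted score (hypothetical measure).

    sstables: list of (name, hotness, reads_per_sec) tuples.
    Returns the names in the chosen bucket, hottest first.
    """
    ranked = sorted(sstables, key=lambda s: s[1], reverse=True)
    if len(ranked) < min_size:
        return []
    max_hotness = ranked[0][1]
    total_reads = sum(s[2] for s in ranked)
    if max_hotness <= 0 or total_reads <= 0:
        return []

    best, best_score = [], float("-inf")
    for size in range(min_size, len(ranked) + 1):
        bucket = ranked[:size]
        # Measure 1: average hotness of the bucket, scaled to [0, 1].
        avg_hotness = sum(s[1] for s in bucket) / size / max_hotness
        # Measure 2: share of the CF's total reads covered by the bucket.
        read_share = sum(s[2] for s in bucket) / total_reads
        score = (hotness_weight * avg_hotness
                 + (1.0 - hotness_weight) * read_share)
        if score > best_score:
            best, best_score = bucket, score
    return [s[0] for s in best]


if __name__ == "__main__":
    tables = [("a", 10.0, 1000), ("b", 8.0, 800),
              ("c", 0.5, 50), ("d", 0.1, 10)]
    print(choose_bucket(tables, hotness_weight=1.0))
    print(choose_bucket(tables, hotness_weight=0.0))
```

With hotness_weight=1.0 only the first measure counts and the sketch picks the two hottest sstables; with hotness_weight=0.0 only the second counts and it picks all of them, matching the two extremes described above. Values in between trade one off against the other.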

> Consider coldness in STCS compaction
> ------------------------------------
>
>                 Key: CASSANDRA-6109
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6109
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Tyler Hobbs
>             Fix For: 2.0.2
>
>         Attachments: 6109-v1.patch, 6109-v2.patch
>
>
> I see two options:
> # Don't compact cold sstables at all
> # Compact cold sstables only if there is nothing more important to compact
> The latter is better if you have cold data that may become hot again...  but it's confusing if you have a workload such that you can't keep up with *all* compaction, but you can keep up with hot sstables.  (Compaction backlog stat becomes useless since we fall increasingly behind.)


--
This message was sent by Atlassian JIRA
(v6.1#6144)