cassandra-commits mailing list archives

From "T Jake Luciani (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9365) Prioritize compactions based on read activity
Date Tue, 12 May 2015 20:54:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540705#comment-14540705 ]

T Jake Luciani commented on CASSANDRA-9365:
-------------------------------------------

[~jbellis] that's within a table. I'm talking about across tables.

Thought experiment: you have 10 tables and 4 compaction slots. You write evenly to all tables,
but one table gets 80% of the reads. You want that table to get the most compaction time.
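
A minimal sketch of the thought experiment above, in Python. It assumes each table has one pending compaction and that a per-table read share is known; `pick_compactions`, `read_share`, and the table names are hypothetical and are not part of Cassandra.

```python
def pick_compactions(pending, read_share, slots):
    """Pick which pending compactions get a slot.

    pending    -- table names in FIFO arrival order (hypothetical input)
    read_share -- fraction of all reads served by each table (assumed known)
    slots      -- number of concurrent compaction slots
    """
    # sorted() is stable, so tables with equal read share keep FIFO order.
    ranked = sorted(pending, key=lambda table: -read_share.get(table, 0.0))
    return ranked[:slots]


tables = [f"t{i}" for i in range(10)]

# Writes are even, so every table has pending work; t7 serves 80% of reads
# and the other nine tables split the remaining 20%.
read_share = {table: 0.2 / 9 for table in tables}
read_share["t7"] = 0.8

fifo = tables[:4]                                  # ['t0', 't1', 't2', 't3']
by_reads = pick_compactions(tables, read_share, 4)  # ['t7', 't0', 't1', 't2']
```

With plain FIFO the read-heavy table waits behind six others; ranking by read share gives it a slot immediately.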

> Prioritize compactions based on read activity
> ---------------------------------------------
>
>                 Key: CASSANDRA-9365
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9365
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: T Jake Luciani
>             Fix For: 3.x
>
>
> The main purpose of compaction is to keep reads fast by consolidating SSTables to
> avoid merging on read.
> In a cluster with many tables we currently treat all pending compactions as equal, when
> in reality we may be reading mainly from one of the tables.
> Rather than FIFO, we should prioritize access to the compactors based on read activity.
> SSTables per read might be a good metric. We would also need to be fair to other
> tables over time. This would skew the work towards the tables that need compaction
> the most.
> It might also be nice to offer a nodetool command to kill specific in-progress compaction
> jobs that are not important under load.
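
One possible way to combine the "SSTables per read" metric with fairness over time is to add an aging bonus to each task that is passed over. The Python sketch below is only an illustration under that assumption: `PendingCompaction`, `score`, `schedule_round`, and `aging_weight` are hypothetical names, not Cassandra APIs, and the ticket does not prescribe this formula.

```python
from dataclasses import dataclass


@dataclass
class PendingCompaction:
    table: str
    sstables_per_read: float  # assumed to come from per-table read metrics
    waited: int = 0           # scheduling rounds this task has been passed over


def score(task, aging_weight=0.5):
    # Read pressure plus an aging bonus, so cold tables are not starved forever.
    return task.sstables_per_read + aging_weight * task.waited


def schedule_round(queue, slots, aging_weight=0.5):
    """Return (chosen, skipped) for one round; skipped tasks age by one."""
    ranked = sorted(queue, key=lambda t: score(t, aging_weight), reverse=True)
    chosen, skipped = ranked[:slots], ranked[slots:]
    for task in skipped:
        task.waited += 1
    return chosen, skipped


# One slot; a hot table keeps producing new work every round.
cold = PendingCompaction("cold", sstables_per_read=1.0)
rounds_until_cold_runs = None
for round_no in range(1, 51):
    hot = PendingCompaction("hot", sstables_per_read=8.0)
    chosen, _ = schedule_round([hot, cold], slots=1)
    if chosen[0].table == "cold":
        rounds_until_cold_runs = round_no
        break
```

The hot table wins every early round, but the cold table's score grows each time it is skipped, so it eventually gets a slot. The value of `aging_weight` controls how quickly that happens and would need tuning.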



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
