cassandra-commits mailing list archives

From "Tyler Hobbs (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-5515) Track sstable coldness
Date Wed, 25 Sep 2013 22:11:05 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tyler Hobbs updated CASSANDRA-5515:
-----------------------------------

    Attachment: 5515-2.0-v1.txt

Attached patch 5515-2.0-v1.txt (and [branch|https://github.com/thobbs/cassandra/tree/CASSANDRA-5515])
adds a custom Meter-like class and tracks 15m and 2h average read rates for sstables, persisting
them in a new table, system.sstable_activity.

Unfortunately, I had to make an entirely new class for the meter and EWMA due to various
difficulties in subclassing/reusing those directly from Metrics.  I based the classes on the
[3.0.1 Meter|https://github.com/codahale/metrics/blob/v3.0.1/metrics-core/src/main/java/com/codahale/metrics/Meter.java]
and [EWMA|https://github.com/codahale/metrics/blob/v3.0.1/metrics-core/src/main/java/com/codahale/metrics/EWMA.java]
classes, although I simplified them somewhat.  (I tried to push some changes upstream to make
this easier in the future, but it looks like it won't make much of a difference.)

My main uncertainty with this patch is where to hook in the deletion of rows from system.sstable_activity
when an sstable is deleted, so any suggestions there would be appreciated.

Also, I wasn't sure how configurable we wanted to make this in terms of the tick interval
(5 seconds), the persistence interval (5 minutes), and throttling (100 writes/sec), so all
of those are hard-coded for now.  I think those values are reasonable, but it's hard to know
for sure.
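
To make the tick interval and the two averaging windows concrete, here is a minimal sketch of an
EWMA-based meter in the style of the Metrics 3.0.1 classes.  It is not the code from the attached
patch: the class name SSTableReadMeterSketch, its method names, and the explicit clock argument
(passed in so the example is deterministic) are all assumptions made for illustration.  Only the
5 second tick and the 15m and 2h windows come from the description above.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of a Meter-like class; not the class from the patch. */
final class SSTableReadMeterSketch {
    // Tick interval taken from the description above (5 seconds).
    static final long TICK_INTERVAL_NANOS = TimeUnit.SECONDS.toNanos(5);
    static final double TICK_INTERVAL_SECONDS = 5.0;

    /** Exponentially weighted moving average of a per-second rate. */
    static final class Ewma {
        private final double alpha;
        private final AtomicLong uncounted = new AtomicLong();
        private volatile boolean initialized = false;
        private volatile double rate = 0.0; // events per second

        Ewma(long window, TimeUnit unit) {
            double windowSeconds = unit.toSeconds(window);
            // Same smoothing-factor formula as the Metrics EWMA class.
            this.alpha = 1.0 - Math.exp(-TICK_INTERVAL_SECONDS / windowSeconds);
        }

        void update(long n) {
            uncounted.addAndGet(n);
        }

        /** Folds the events seen since the last tick into the average. */
        void tick() {
            double instantRate = uncounted.getAndSet(0) / TICK_INTERVAL_SECONDS;
            if (initialized) {
                rate += alpha * (instantRate - rate);
            } else {
                rate = instantRate; // first tick seeds the average
                initialized = true;
            }
        }

        double rate() {
            return rate;
        }
    }

    private final Ewma m15 = new Ewma(15, TimeUnit.MINUTES);
    private final Ewma m120 = new Ewma(2, TimeUnit.HOURS);
    private final AtomicLong lastTick;

    SSTableReadMeterSketch(long startNanos) {
        this.lastTick = new AtomicLong(startNanos);
    }

    /** Records n reads observed at time nowNanos. */
    void mark(long n, long nowNanos) {
        tickIfNecessary(nowNanos);
        m15.update(n);
        m120.update(n);
    }

    double fifteenMinuteRate(long nowNanos) {
        tickIfNecessary(nowNanos);
        return m15.rate();
    }

    double twoHourRate(long nowNanos) {
        tickIfNecessary(nowNanos);
        return m120.rate();
    }

    /** Applies one tick per elapsed interval; only the CAS winner ticks. */
    private void tickIfNecessary(long nowNanos) {
        long oldTick = lastTick.get();
        long age = nowNanos - oldTick;
        if (age >= TICK_INTERVAL_NANOS) {
            long newTick = nowNanos - age % TICK_INTERVAL_NANOS;
            if (lastTick.compareAndSet(oldTick, newTick)) {
                long ticks = age / TICK_INTERVAL_NANOS;
                for (long i = 0; i < ticks; i++) {
                    m15.tick();
                    m120.tick();
                }
            }
        }
    }
}
```

With this sketch, 500 reads marked during the first 5 second interval seed both averages at 100
reads/sec; once reads stop, the 15m average decays faster than the 2h one, which is the signal a
coldness check could compare.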
                
> Track sstable coldness
> ----------------------
>
>                 Key: CASSANDRA-5515
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5515
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Tyler Hobbs
>             Fix For: 2.0.2
>
>         Attachments: 0001-Track-row-read-counts-in-SSTR.patch, 5515-2.0-v1.txt
>
>
> Keeping a count of reads per-sstable would allow STCS to automatically ignore cold data
> rather than recompacting it constantly with hot data, dramatically reducing compaction load
> for typical time series applications and others with time-correlated access patterns.  We
> would not need a separate age-tiered compaction strategy.
> (This will really be useful in conjunction with CASSANDRA-5514.)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
