cassandra-commits mailing list archives

From "Jeff Jirsa (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-9597) DTCS should consider file SIZE in addition to time windowing
Date Mon, 15 Jun 2015 01:20:00 GMT
Jeff Jirsa created CASSANDRA-9597:
-------------------------------------

             Summary: DTCS should consider file SIZE in addition to time windowing
                 Key: CASSANDRA-9597
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9597
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Jeff Jirsa
            Priority: Minor


DTCS seems to work well for the typical use case - writing data in perfect time order, compacting
recent files, and ignoring older files.

However, there are "normal" operational actions where DTCS will fall behind and is unlikely
to recover.

An example of this is streaming operations (for example, bootstrap or loading data into a
cluster using sstableloader), where lots (tens of thousands) of very small sstables can be
created spanning multiple time buckets. In these cases, even if max_sstable_age_days is extended
to allow the older incoming files to be compacted, the selection logic is likely to re-compact
large files with only a few small files over and over, rather than prioritizing selection of the
max_threshold smallest files to decrease the number of candidate sstables as quickly as possible.
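
A minimal sketch of the size-aware selection suggested above, for illustration only. It is not Cassandra's actual DTCS code: the SSTable type, the pick_smallest function, and the default threshold values are hypothetical names and assumptions. It assumes the candidates have already been grouped into a single time bucket.

```python
from typing import List, NamedTuple


class SSTable(NamedTuple):
    """Hypothetical stand-in for an sstable: a name and its on-disk size."""
    name: str
    size_bytes: int


def pick_smallest(candidates: List[SSTable],
                  min_threshold: int = 4,
                  max_threshold: int = 32) -> List[SSTable]:
    """Return up to max_threshold of the smallest candidates in one bucket.

    Returns an empty list when fewer than min_threshold candidates exist,
    since compacting so few files is assumed not to be worthwhile.
    """
    if len(candidates) < min_threshold:
        return []
    # Sort ascending by size so the many tiny streamed files are merged
    # first, instead of repeatedly rewriting the large ones.
    return sorted(candidates, key=lambda s: s.size_bytes)[:max_threshold]


# Example: one large file and four small streamed files in the same bucket.
bucket = [
    SSTable("big", 10_000),
    SSTable("a", 10),
    SSTable("b", 30),
    SSTable("c", 20),
    SSTable("d", 5),
]
chosen = pick_smallest(bucket, min_threshold=2, max_threshold=3)
print([t.name for t in chosen])
```

With max_threshold=3 the large file is left out of this round, so each compaction reduces the sstable count by merging small files rather than rewriting the large one.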

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
