cassandra-commits mailing list archives

From "Tyler Hobbs (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8635) STCS cold sstable omission does not handle overwrites without reads
Date Wed, 21 Jan 2015 22:58:36 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286481#comment-14286481 ]

Tyler Hobbs commented on CASSANDRA-8635:
----------------------------------------

This definitely seems like it will help many cases, but it seems like there are still a couple
of problematic scenarios:
* What if the cold sstables overlap with each other but not with any hot sstables?
* What if they all overlap by 75% and thus fall below the threshold?

If we go with this strategy, it seems like we still need a safety mechanism to ensure that
the number of sstables never blows up, such as a cap on the maximum number of sstables.
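
To make the two scenarios and the suggested safety mechanism concrete, here is a minimal
Python sketch. It is not the attached patch and not Cassandra code: SSTable, overlap_fraction,
compaction_candidates, the 0.8 threshold, and the max_sstables cap are all hypothetical names
and values, and overlap is assumed to be the covered fraction of an integer token range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for an sstable: a token range plus a hot/cold flag."""
    name: str
    first_token: int  # inclusive
    last_token: int   # inclusive
    hot: bool


def overlap_fraction(a, b):
    """Fraction of a's token range that is also covered by b (0.0 to 1.0)."""
    lo = max(a.first_token, b.first_token)
    hi = min(a.last_token, b.last_token)
    if hi < lo:
        return 0.0
    return (hi - lo + 1) / (a.last_token - a.first_token + 1)


def compaction_candidates(sstables, overlap_threshold=0.8, max_sstables=4):
    """Return the names of sstables eligible for compaction.

    Hot sstables are always eligible. A cold sstable is eligible only if it
    overlaps some hot sstable by at least overlap_threshold (assumed rule).
    Safety cap (assumed): once the table holds more than max_sstables,
    the remaining cold sstables become eligible as well.
    """
    hot = [s for s in sstables if s.hot]
    cold = [s for s in sstables if not s.hot]

    candidates = list(hot)
    skipped = []
    for c in cold:
        if any(overlap_fraction(c, h) >= overlap_threshold for h in hot):
            candidates.append(c)
        else:
            skipped.append(c)

    # Without this cap, the skipped cold sstables would never be compacted.
    if len(sstables) > max_sstables:
        candidates.extend(skipped)

    return [s.name for s in candidates]
```

With three cold sstables that fully overlap each other and no hot sstables, the overlap rule
alone selects nothing; only the cap (for example max_sstables=2) makes them eligible. A cold
sstable covering tokens 25-124 against a hot one covering 0-99 overlaps by 0.75 and is
likewise skipped under the assumed 0.8 threshold.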

After thinking about this for a while, I'm also tempted to scrap the whole don't-compact-cold-sstables
approach in STCS in favor of using DTCS.  (The cold sstable omission was implemented prior to
DTCS being proposed.)  Since STCS somewhat randomly mixes data within sstables, the only pattern
that is likely to generate "cold" sstables is when old data is read infrequently, and DTCS
addresses many of those patterns more effectively.  Are there any opinions on this?

> STCS cold sstable omission does not handle overwrites without reads
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-8635
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8635
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Tyler Hobbs
>            Assignee: Marcus Eriksson
>            Priority: Critical
>             Fix For: 2.1.3
>
>         Attachments: 0001-Include-cold-sstables-in-compactions-if-they-overlap.patch
>
>
> In 2.1, STCS may omit cold SSTables from compaction (CASSANDRA-6109).  If data is regularly
> overwritten or deleted (but not enough to trigger a single-sstable tombstone purging compaction),
> data size on disk may continuously grow if:
> * The table receives very few reads
> * The reads only touch the newest SSTables
> Basically, if the overwritten data is never read and there aren't many tombstones, STCS
> has no incentive to compact the sstables.  We should take sstable overlap into consideration
> as well as coldness to address this case.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
