cassandra-commits mailing list archives

From "Jim Plush (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8359) Make DTCS consider removing SSTables much more frequently
Date Tue, 31 Mar 2015 21:28:54 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14389451#comment-14389451 ]

Jim Plush commented on CASSANDRA-8359:
--------------------------------------

Would love to see this: at high scale, on PB+ clusters, the additional disk space used by
completely expired SSTables is considerable. For example, I have a table with a 1-day TTL,
yet DTCS leaves SSTables on disk whose max timestamp in the sstable metadata is 5+ days old.
Dropping those should just be an rm -f type operation. I assumed this was how DTCS was
supposed to work.
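
To illustrate, a minimal sketch of that check (hypothetical names, assuming a single
table-wide TTL and microsecond cell timestamps as recorded in sstable metadata; this is
not Cassandra's actual code path):

    import java.util.concurrent.TimeUnit;

    final class ExpiredSSTableCheck
    {
        // True if every cell in the SSTable was written more than
        // ttl + gc_grace seconds ago, i.e. the whole file holds only
        // expired data and could simply be removed from disk.
        static boolean fullyExpired(long maxTimestampMicros, int ttlSeconds, int gcGraceSeconds)
        {
            long nowMicros = TimeUnit.MILLISECONDS.toMicros(System.currentTimeMillis());
            long expiryMicros = maxTimestampMicros
                    + TimeUnit.SECONDS.toMicros((long) ttlSeconds + gcGraceSeconds);
            return expiryMicros < nowMicros;
        }
    }

In practice the check also has to confirm that no other SSTable overlaps this one with
older data that its tombstones still shadow, which is what the "overlapping" argument to
CompactionController.getFullyExpiredSSTables (discussed below) accounts for.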

> Make DTCS consider removing SSTables much more frequently
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-8359
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8359
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Björn Hegerfors
>            Assignee: Björn Hegerfors
>            Priority: Minor
>         Attachments: cassandra-2.0-CASSANDRA-8359.txt
>
>
> When I run DTCS on a table where every value has a TTL (always the same TTL), SSTables
> are completely expired, but still stay on disk for much longer than they need to. I've
> applied CASSANDRA-8243, but it doesn't make an apparent difference (probably because the
> subject SSTables are purged via compaction anyway, if not by directly dropping them).
> Disk size graphs show clearly that tombstones are only removed when the oldest SSTable
> participates in compaction. In the long run, size on disk continually grows bigger. This
> should not have to happen. It should easily be able to stay constant, thanks to DTCS
> separating the expired data from the rest.
> I think checks for whether SSTables can be dropped should happen much more frequently.
> This is something that probably only needs to be tweaked for DTCS, but perhaps there's a
> more general place to put this. Anyway, my thinking is that DTCS should, on every call to
> getNextBackgroundTask, check which SSTables can be dropped. It would be something like a
> call to CompactionController.getFullyExpiredSSTables with all non-compacting SSTables
> sent in as "compacting" and all other SSTables sent in as "overlapping". The returned
> SSTables, if any, are then added to whichever set of SSTables DTCS decides to compact.
> Then before the compaction happens, Cassandra is going to make another call to
> CompactionController.getFullyExpiredSSTables, where it will see that it can just drop them.
> This approach has a bit of redundancy in that it needs to call
> CompactionController.getFullyExpiredSSTables twice. To avoid that, the code path for
> deciding which SSTables to drop would have to be changed.
> (Side-tracking a little here: I'm also thinking that tombstone compactions could be
> considered more often in DTCS. Maybe even some kind of multi-SSTable tombstone compaction
> involving the oldest couple of SSTables...)
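
A rough sketch of the shape the proposal above could take, assuming the 2.0/2.1-era
classes the ticket names. getFullyExpiredSSTables is the real method the ticket refers
to; the surrounding helper names and exact signatures here are approximations that vary
by Cassandra version, and this is not the attached patch:

    import java.util.HashSet;
    import java.util.Set;

    import org.apache.cassandra.db.ColumnFamilyStore;
    import org.apache.cassandra.db.compaction.CompactionController;
    import org.apache.cassandra.io.sstable.SSTableReader;

    final class ExpiredDropSketch
    {
        // Would run on every getNextBackgroundTask() call: fold fully
        // expired SSTables into whatever set DTCS already chose, so the
        // later getFullyExpiredSSTables() call drops them outright.
        static Set<SSTableReader> withExpired(ColumnFamilyStore cfs,
                                              Set<SSTableReader> chosenByDtcs,
                                              int gcBefore)
        {
            // Per the description above: all non-compacting SSTables go in
            // as "compacting", everything overlapping them as "overlapping".
            Set<SSTableReader> nonCompacting = cfs.getUncompactingSSTables();
            Set<SSTableReader> expired = CompactionController.getFullyExpiredSSTables(
                    cfs, nonCompacting, cfs.getOverlappingSSTables(nonCompacting), gcBefore);

            Set<SSTableReader> toCompact = new HashSet<>(chosenByDtcs);
            toCompact.addAll(expired); // dropped, not rewritten, when compaction runs
            return toCompact;
        }
    }

As the ticket notes, getFullyExpiredSSTables still ends up being called a second time
when the compaction task actually starts; that second call is where the expired files
get dropped instead of rewritten.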



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
