cassandra-commits mailing list archives

From Björn Hegerfors (JIRA) <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-8243) DTCS can leave time-overlaps, limiting ability to expire entire SSTables
Date Mon, 03 Nov 2014 00:20:33 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-8243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Björn Hegerfors updated CASSANDRA-8243:
---------------------------------------
    Attachment: cassandra-trunk-CASSANDRA-8243-aggressiveTTLExpiry.txt

I've made a simple change in the getFullyExpiredSSTables method (removing one line did the
trick). An SSTable is now dropped as long as it is fully expired and its getMaxTimestamp is
less than the getMinTimestamp of every overlapping SSTable that contains any column that's
still alive. The difference between this condition and the previous one is subtle, but to my
understanding the old condition was being unnecessarily cautious. This one should be safe, and
it will certainly solve this issue.

But of course, if this is wrong, then that could be a serious bug. So this has to be carefully
reviewed.
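
To make the relaxed condition concrete for reviewers, here is a minimal sketch of it. It is not the attached patch. The Table class, its fields, and the droppable method are hypothetical stand-ins, and the sketch assumes every table passed in overlaps every other one in terms of keys, so only timestamps and expiry matter.

```java
import java.util.ArrayList;
import java.util.List;

final class ExpirySketch {
    /** Hypothetical stand-in for the per-SSTable metadata the check needs. */
    static final class Table {
        final String name;
        final long minTimestamp;        // smallest write timestamp in the table
        final long maxTimestamp;        // largest write timestamp in the table
        final int maxLocalDeletionTime; // latest time at which any column in the table expires

        Table(String name, long minTimestamp, long maxTimestamp, int maxLocalDeletionTime) {
            this.name = name;
            this.minTimestamp = minTimestamp;
            this.maxTimestamp = maxTimestamp;
            this.maxLocalDeletionTime = maxLocalDeletionTime;
        }
    }

    /** A table is fully expired when every column in it expired before gcBefore. */
    static boolean fullyExpired(Table t, int gcBefore) {
        return t.maxLocalDeletionTime < gcBefore;
    }

    /**
     * Returns the tables that could be dropped whole under the relaxed condition:
     * fully expired, and strictly older than every table that still holds live data.
     * Assumption: all tables in the list overlap each other key-wise.
     */
    static List<Table> droppable(List<Table> tables, int gcBefore) {
        // Only tables that still contain live columns constrain what can be dropped.
        long minLiveTimestamp = Long.MAX_VALUE;
        for (Table t : tables) {
            if (!fullyExpired(t, gcBefore)) {
                minLiveTimestamp = Math.min(minLiveTimestamp, t.minTimestamp);
            }
        }

        List<Table> result = new ArrayList<>();
        for (Table t : tables) {
            if (fullyExpired(t, gcBefore) && t.maxTimestamp < minLiveTimestamp) {
                result.add(t);
            }
        }
        return result;
    }
}
```

With tables A (timestamps 0 to 100), B (95 to 200) and C (190 to 300), where A and B are fully expired and C is still alive, this sketch drops only A. A's small overlap with the expired B does not block it, while B is kept because its timestamps overlap the live C.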

> DTCS can leave time-overlaps, limiting ability to expire entire SSTables
> ------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8243
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8243
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Björn Hegerfors
>            Assignee: Björn Hegerfors
>            Priority: Minor
>              Labels: compaction, performance
>             Fix For: 2.0.12, 2.1.2
>
>         Attachments: cassandra-trunk-CASSANDRA-8243-aggressiveTTLExpiry.txt
>
>
> CASSANDRA-6602 (DTCS) and CASSANDRA-5228 are supposed to be a perfect match for tables
> where every value is written with a TTL. DTCS makes sure to keep old data separate from new
> data. So shortly after the TTL has passed, Cassandra should be able to throw away the whole
> SSTable containing a given data point.
> CASSANDRA-5228 deletes the very oldest SSTables, and only if they don't overlap (in terms
> of timestamps) with another SSTable which cannot be deleted.
> DTCS, however, can't guarantee that SSTables won't overlap (again, in terms of timestamps).
> In a test that I ran, every single SSTable overlapped with its nearest neighbors by a very
> tiny amount. My reasoning for why this could happen is that the dumped memtables were already
> overlapping from the start. DTCS will never create an overlap where there is none. I surmised
> that this happened in my case because I sent parallel writes which must have come out of order.
> This was just locally, and out-of-order writes should be much more common non-locally.
> That means that the SSTable removal optimization may never get a chance to kick in!
> I can see two solutions:
> 1. Make DTCS split SSTables on time window borders. This will essentially only be done
> on a newly dumped memtable once every base_time_seconds.
> 2. Make TTL SSTable expiry more aggressive. Relax the conditions on which an SSTable
> can be dropped completely, of course without affecting any semantics.
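
As a rough illustration of option 1 in the description above, the sketch below groups write timestamps by the time window they fall into, so that each group could be written to its own SSTable without crossing a window border. It is not DTCS code. The class and method names are hypothetical, and it assumes fixed-size windows aligned to multiples of the window size, with timestamps and window size in the same unit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class WindowSplitSketch {
    /**
     * Groups timestamps by the start of the window that contains them.
     * Assumption: windows are [k * windowSize, (k + 1) * windowSize).
     */
    static Map<Long, List<Long>> splitByWindow(List<Long> timestamps, long windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        // TreeMap keeps the windows ordered from oldest to newest.
        Map<Long, List<Long>> buckets = new TreeMap<>();
        for (long ts : timestamps) {
            // floorDiv rounds toward negative infinity, so negative timestamps
            // also land in the correct window.
            long windowStart = Math.floorDiv(ts, windowSize) * windowSize;
            buckets.computeIfAbsent(windowStart, k -> new ArrayList<>()).add(ts);
        }
        return buckets;
    }
}
```

For example, with a window size of 60, the out-of-order timestamps 58, 59, 61, 60, 119, 120 end up in three groups: {58, 59} for window 0, {61, 60, 119} for window 60, and {120} for window 120. No group spans a border, so neighboring groups cannot overlap in timestamps.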



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
