cassandra-commits mailing list archives

From "Corentin Chary (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-13418) Allow TWCS to ignore overlaps
Date Wed, 05 Apr 2017 21:17:41 GMT
Corentin Chary created CASSANDRA-13418:
------------------------------------------

             Summary: Allow TWCS to ignore overlaps
                 Key: CASSANDRA-13418
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
             Project: Cassandra
          Issue Type: Improvement
          Components: Compaction
            Reporter: Corentin Chary


http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains the problem well. If you
really want read-repairs, you're going to have SSTables blocking the expiration of other fully
expired SSTables because they overlap.

You can set unchecked_tombstone_compaction = true or tombstone_threshold to a very low value
and that will purge the blockers of old data that should already have expired, thus removing
the overlaps and allowing the other SSTables to expire.
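
For reference, a rough sketch of what that workaround looks like as a table setting. The helper
name, the keyspace/table names and the 0.05 threshold are placeholders chosen for illustration,
not values from a real cluster; only the two compaction sub-options come from the description
above. The helper just renders the CQL statement as a string:

```python
def twcs_tombstone_options_cql(keyspace, table, tombstone_threshold=0.05):
    """Render an ALTER TABLE statement enabling aggressive tombstone compaction.

    keyspace, table and the default threshold are hypothetical placeholders.
    """
    options = {
        "class": "TimeWindowCompactionStrategy",
        # Run single-SSTable tombstone compactions without the overlap pre-check.
        "unchecked_tombstone_compaction": "true",
        # A very low ratio makes almost any SSTable eligible for such a compaction.
        "tombstone_threshold": str(tombstone_threshold),
    }
    rendered = ", ".join(f"'{key}': '{value}'" for key, value in options.items())
    return f"ALTER TABLE {keyspace}.{table} WITH compaction = {{{rendered}}};"


print(twcs_tombstone_options_cql("metrics", "datapoints"))
```

As far as I know, ALTER TABLE replaces the whole compaction map, so any existing window settings
would have to be repeated in the same statement.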

The thing is that this is rather CPU intensive and not optimal. If you have time series, you
might not care if not all of your data expires at exactly the right time, or if data re-appears
for some time, as long as it gets deleted as soon as it can. In this situation I believe
it would be really beneficial to allow users to simply ignore overlapping SSTables when looking
for fully expired ones.
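
To make the idea concrete, here is a simplified model of the expiration check, not the actual
Cassandra code. It assumes every SSTable overlaps every other one in key range (typical for
time series spread over the whole ring) and ignores memtables. The SSTable fields, the
fully_expired function and the ignore_overlaps flag are all names made up for this sketch:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SSTable:
    name: str
    min_timestamp: int            # oldest write timestamp in the file
    max_timestamp: int            # newest write timestamp in the file
    max_local_deletion_time: int  # time (seconds) at which the last cell expires


def fully_expired(sstables: List[SSTable], gc_before: int,
                  ignore_overlaps: bool = False) -> List[SSTable]:
    """Return the SSTables that could be dropped without being compacted."""
    # Candidates: everything in the file is already past its expiration.
    candidates = [s for s in sstables if s.max_local_deletion_time < gc_before]
    if ignore_overlaps:
        # Proposed behaviour: do not let other SSTables block the drop.
        return candidates

    # Current behaviour (simplified): an SSTable that still holds live data
    # blocks any candidate that might contain newer writes than its oldest one.
    blockers = [s for s in sstables if s.max_local_deletion_time >= gc_before]
    if not blockers:
        return candidates
    min_live_timestamp = min(s.min_timestamp for s in blockers)
    return [s for s in candidates if s.max_timestamp < min_live_timestamp]


old_window = SSTable("old-window", min_timestamp=0, max_timestamp=100,
                     max_local_deletion_time=1000)
# A recent SSTable that received one read-repaired cell with an old timestamp.
read_repaired = SSTable("recent", min_timestamp=50, max_timestamp=5000,
                        max_local_deletion_time=6000)

blocked = fully_expired([old_window, read_repaired], gc_before=2000)
dropped = fully_expired([old_window, read_repaired], gc_before=2000,
                        ignore_overlaps=True)
print([s.name for s in blocked], [s.name for s in dropped])
```

In this model a single read-repaired cell with an old timestamp is enough to keep the old
window on disk, while ignoring overlaps lets it be dropped as soon as it has expired.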

To the question: why would you need read-repairs?
- Full repairs basically take longer than the TTL of the data on my dataset, so this isn't
really effective.
- Even with a 10% chance of doing a read-repair, we found out that this would be enough to greatly
reduce entropy of the most used data (and if you have time series, you're likely to have a
dashboard doing the same important queries over and over again).
- LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.

I'll try to come up with a patch demonstrating how this would work, try it on our system and
report the effects.

cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
