cassandra-commits mailing list archives

From "Marcus Eriksson (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-8920) Remove IntervalTree from maxPurgeableTimestamp calculation
Date Wed, 18 Mar 2015 12:30:38 GMT


Marcus Eriksson commented on CASSANDRA-8920:

(adding comment here after discussion on irc)

This would probably be quite a bit slower for LCS, since overlappingSSTables contains the
sstables that overlap the currently compacting ones but are not themselves being compacted.
For LCS, that means it would contain all other sstables on the node when doing an L0
-> L1 compaction.

For STCS this would probably work very well since we would almost always return all sstables
from the interval tree. Perhaps we should let the compaction strategy decide if we should
use the interval tree or not.
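
A minimal sketch of what "let the compaction strategy decide" could look like. All names here
(OverlapLookupSketch, Strategy, narrowByInterval, Bounds, candidates) are hypothetical and not
taken from the Cassandra code base, and the linear bounds filter only stands in for a real
interval tree search:

```java
import java.util.ArrayList;
import java.util.List;

final class OverlapLookupSketch {
    /** Hypothetical per-strategy switch: should candidates be narrowed by key interval? */
    enum Strategy {
        LEVELED(true),       // bounded sstables, so narrowing can discard most of them
        SIZE_TIERED(false);  // nearly every sstable spans the full range, so skip the search

        final boolean narrowByInterval;

        Strategy(boolean narrowByInterval) {
            this.narrowByInterval = narrowByInterval;
        }
    }

    /** Hypothetical stand-in for an sstable's first/last token bounds. */
    static final class Bounds {
        final String name;
        final long first;
        final long last;

        Bounds(String name, long first, long last) {
            this.name = name;
            this.first = first;
            this.last = last;
        }
    }

    /** Returns the sstables that still need to be checked for the given token. */
    static List<Bounds> candidates(Strategy strategy, List<Bounds> overlapping, long token) {
        if (!strategy.narrowByInterval) {
            return overlapping; // no interval search: check every overlapping sstable
        }
        List<Bounds> hits = new ArrayList<>();
        for (Bounds b : overlapping) { // linear filter standing in for the interval tree lookup
            if (b.first <= token && token <= b.last) {
                hits.add(b);
            }
        }
        return hits;
    }
}
```

With three sstables covering disjoint token ranges, the LEVELED setting narrows the candidates
to the one range containing the token, while SIZE_TIERED returns the list unchanged.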

> Remove IntervalTree from maxPurgeableTimestamp calculation
> ----------------------------------------------------------
>                 Key: CASSANDRA-8920
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Minor
>             Fix For: 2.1.4
>         Attachments: 8920.txt
> The IntervalTree only maps partition keys. Since a majority of users deploy a hashed
partitioner, the work is mostly wasted: keys will be evenly distributed across the full
token range owned by the node, and in some cases it is a significant amount of work. We can
corroborate against the file bounds as a sanity check if we get a bloom filter (BF) match,
but performing an IntervalTree search is significantly more expensive (esp. once
murmur hash calculation memoization goes mainstream).
> In LCS, the keys are bounded, so it might appear that it would help, but in this scenario
we only compact against like bounds, so again it is not helpful.
> With a ByteOrderedPartitioner it could potentially be of use, but this is sufficiently
rare to not optimise for IMO.
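
The approach described above (bloom filter check first, file bounds as an optional sanity
check, no IntervalTree search) might be sketched roughly as follows. This is an illustration
under assumptions, not the attached patch: PurgeableSketch, Table, mightContain and
boundsContain are hypothetical names, tokens are plain longs, and a LongPredicate stands in
for a real bloom filter:

```java
import java.util.List;
import java.util.function.LongPredicate;

final class PurgeableSketch {
    /** Hypothetical stand-in for an sstable: token bounds, min timestamp, membership filter. */
    static final class Table {
        final long firstToken;
        final long lastToken;
        final long minTimestamp;
        final LongPredicate mightContain; // plays the bloom filter; false positives are allowed

        Table(long firstToken, long lastToken, long minTimestamp, LongPredicate mightContain) {
            this.firstToken = firstToken;
            this.lastToken = lastToken;
            this.minTimestamp = minTimestamp;
            this.mightContain = mightContain;
        }

        boolean boundsContain(long token) {
            return firstToken <= token && token <= lastToken;
        }
    }

    /**
     * Smallest minTimestamp among overlapping sstables that may hold the token.
     * Returns Long.MAX_VALUE when no overlapping sstable can contain it.
     */
    static long maxPurgeableTimestamp(List<Table> overlapping, long token) {
        long min = Long.MAX_VALUE;
        for (Table t : overlapping) {
            // Filter check first; the bounds comparison only weeds out false positives.
            if (t.mightContain.test(token) && t.boundsContain(token)) {
                min = Math.min(min, t.minTimestamp);
            }
        }
        return min;
    }
}
```

Every overlapping sstable is visited, but each visit costs one filter probe and two
comparisons, which is the trade the description argues for when keys are hashed across the
whole token range.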

