Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cassandra.apache.org
Date: Thu, 5 Mar 2015 13:49:38 +0000 (UTC)
From: "Benedict (JIRA)" <jira@apache.org>
To: commits@cassandra.apache.org
Message-ID: <JIRA.12779792.1425563356000.10215.1425563378113@Atlassian.JIRA>
In-Reply-To: <JIRA.12779792.1425563356000@Atlassian.JIRA>
References: <JIRA.12779792.1425563356000@Atlassian.JIRA>
 <JIRA.12779792.1425563356757@arcas>
Subject: [jira] [Created] (CASSANDRA-8920) Remove IntervalTree from
 maxPurgeableTimestamp calculation
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

Benedict created CASSANDRA-8920:
-----------------------------------

             Summary: Remove IntervalTree from maxPurgeableTimestamp calculation
                 Key: CASSANDRA-8920
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8920
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Benedict
            Priority: Minor


The IntervalTree only maps partition keys. Since the majority of users deploy a hashed partitioner, the work is mostly wasted: each sstable's keys will be evenly distributed across the full token range owned by the node, so the tree excludes very few sstables. In some cases this is a significant amount of work. If we like, we can corroborate a bloom filter (BF) match against the file bounds as a sanity check, but performing an IntervalTree search is significantly more expensive (esp. once murmur hash calculation memoization goes mainstream).

In LCS, the keys are bounded, so it might appear that the IntervalTree would help, but in this scenario we only compact against like bounds, so again it is not helpful.

With a ByteOrderedPartitioner it could potentially be of use, but this is sufficiently rare to not optimise for IMO.
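
To illustrate the idea, here is a minimal sketch of a calculation that skips the IntervalTree and instead scans the candidate sstables, using the BF check with the optional file-bounds corroboration. It is not the actual Cassandra code: SketchSSTable, MaxPurgeableSketch, and their fields are hypothetical stand-ins, a plain set stands in for the bloom filter, and it assumes the result should be the smallest minimum timestamp among sstables that may contain the key.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for an sstable; NOT Cassandra's SSTableReader.
final class SketchSSTable {
    final long firstToken;   // lowest token stored in the file
    final long lastToken;    // highest token stored in the file
    final long minTimestamp; // oldest write timestamp stored in the file
    // A plain set standing in for the bloom filter (assumption for illustration).
    private final Set<Long> filter;

    SketchSSTable(long firstToken, long lastToken, long minTimestamp, Long... filterTokens) {
        this.firstToken = firstToken;
        this.lastToken = lastToken;
        this.minTimestamp = minTimestamp;
        this.filter = new HashSet<>(Arrays.asList(filterTokens));
    }

    // Stand-in for the BF lookup: true if the file might contain the token.
    boolean mightContain(long token) {
        return filter.contains(token);
    }

    // Cheap corroboration of a BF match against the file bounds.
    boolean boundsContain(long token) {
        return firstToken <= token && token <= lastToken;
    }
}

final class MaxPurgeableSketch {
    // Linear scan over the candidate sstables; no IntervalTree search.
    // Returns Long.MAX_VALUE when no sstable may contain the token.
    static long maxPurgeableTimestamp(long token, List<SketchSSTable> candidates) {
        long min = Long.MAX_VALUE;
        for (SketchSSTable sstable : candidates) {
            // BF check first; the bounds check only filters out false positives.
            if (sstable.mightContain(token) && sstable.boundsContain(token)) {
                min = Math.min(min, sstable.minTimestamp);
            }
        }
        return min;
    }
}
```

The bounds comparison costs two comparisons per BF match, so it stays cheap relative to a tree search while still rejecting a false positive that falls outside the file's range.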


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)