Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cassandra.apache.org
Date: Wed, 16 Jan 2013 16:36:13 +0000 (UTC)
From: "Sylvain Lebresne (JIRA)" <jira@apache.org>
To: commits@cassandra.apache.org
Message-ID: <JIRA.12613371.1351113912075.145883.1358354173537@arcas>
In-Reply-To: <JIRA.12613371.1351113912075@arcas>
References: <JIRA.12613371.1351113912075@arcas>
Subject: [jira] [Updated] (CASSANDRA-4858) Coverage analysis for low-CL
 queries
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


     [ https://issues.apache.org/jira/browse/CASSANDRA-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-4858:
----------------------------------------

    Attachment: 4858-v4.txt

bq. On the snitch heuristics, can we just say "if latency(b) > latency(a) + latency(c), then don't merge?"

That's kind of what we want, though we want to do it on the final list of replicas that is going to be queried (for the first range, the second one, and their union). Because it is currently ReadCallback (created after we've decided whether or not to merge the ranges) that computes that final list of endpoints, I had been too lazy to do that properly. That was a mistake, so I'm attaching a v4 that refactors things a little more (basically pulling the filtering of the final endpoints out of ReadCallback) to do this ticket properly.

With that, the dynamic snitch heuristic ends up being something like:
{noformat}
if (maxScore(endpointsFor(A and B)) > maxScore(endpointsFor(A)) + maxScore(endpointsFor(B)))
    // don't merge
{noformat}
It's maxScore since we're dealing with lists of endpoints and it's the max latency that will define the total latency of the read.
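
To make the check above concrete, here is a minimal, self-contained Java sketch. It is not the code from the v4 patch: the class name MergeHeuristic, the methods maxScore and shouldMerge, the use of plain strings for endpoints, and the score map are all assumptions made for illustration. It also assumes that a higher score means a slower endpoint, and that the three endpoint lists have already been filtered down to the replicas that would actually be queried.

```java
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the merge heuristic; not the actual patch.
public class MergeHeuristic {

    // Worst (highest) score among the endpoints that would be queried.
    // Endpoints with no recorded score are treated as 0.0 (an assumption).
    static double maxScore(List<String> endpoints, Map<String, Double> scores) {
        double max = 0.0;
        for (String endpoint : endpoints) {
            max = Math.max(max, scores.getOrDefault(endpoint, 0.0));
        }
        return max;
    }

    // Merge ranges A and B only if the slowest endpoint of the merged query
    // is no worse than the slowest endpoints of the two separate queries
    // added together.
    static boolean shouldMerge(List<String> endpointsA,
                               List<String> endpointsB,
                               List<String> endpointsMerged,
                               Map<String, Double> scores) {
        double separate = maxScore(endpointsA, scores) + maxScore(endpointsB, scores);
        return maxScore(endpointsMerged, scores) <= separate;
    }
}
```

For example, with scores n1=1.0, n2=1.0 and n3=5.0, merging two ranges served by n1 and n2 into a single query that has to go to n3 would be rejected (5.0 > 1.0 + 1.0), while a merged query that can be served by n1 alone would be accepted.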

I note that this v4 includes Jonathan's cleanup. As commented in that cleanup patch, there is probably simplification to be gained by moving the Table inside ConsistencyLevel, but let's maybe do that in a follow-up ticket?

                
> Coverage analysis for low-CL queries
> ------------------------------------
>
>                 Key: CASSANDRA-4858
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4858
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Vijay
>             Fix For: 1.2.1
>
>         Attachments: 0001-CASSANDRA-4858.patch, 0001-CASSANDRA-4858-v2.patch, 4858-cleanup.txt, 4858-v3-1.txt, 4858-v3-2.txt, 4858-v4.txt
>
>
> There are many cases where getRangeSlice creates more
> RangeSliceCommand than it should, because it always creates one for each range
> returned by getRestrictedRange.  Especially for CL.ONE this does not take
> the replication factor into account and is potentially pretty wasteful.
> A range slice at CL.ONE on a 3 node cluster with RF=3 should only
> ever create one RangeSliceCommand.
