cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-1873) Read Repair behavior thwarts DynamicEndpointSnitch at CL.ONE
Date Fri, 17 Dec 2010 00:25:03 GMT


Jonathan Ellis commented on CASSANDRA-1873:

Note: IMO it is okay to break RR temporarily when upgrading a cluster piecemeal -- that is,
it's okay for RR to not happen; it's not okay to generate internal errors.

> Read Repair behavior thwarts DynamicEndpointSnitch at CL.ONE
> ------------------------------------------------------------
>                 Key: CASSANDRA-1873
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Jonathan Ellis
>             Fix For: 0.6.9, 0.7.1
> When doing a CL.ONE read, the coordinator node selects the data node from the list of
> replicas via snitch sortByProximity.  The data node (_not_ the coordinator) then sends digest
> requests to the remaining replicas, and compares their answers to its own (in ConsistencyChecker).
> This means that, in a multi-datacenter situation, for any given range R with replicas
> X in dc1 and Y in dc2, the only node with latency information for Y will be X.  Since DES
> falls back to subsnitch (static) order when latency information is missing for any replica
> it is asked to sort, DES will be unable to direct requests to Y no matter how overwhelmed
> X becomes.
> To fix this, we should move the digest-checking code into the coordinator node (probably
> starting with the 0.7 ConsistencyChecker, which represents a cleanup of the 0.6 one).
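
The fallback described in the issue can be illustrated with a minimal Java sketch. This is not Cassandra's DynamicEndpointSnitch code: the class SnitchSketch, the simplified sortByProximity signature (replica names as strings, latencies in a map), and the latency values are all hypothetical, chosen only to show why a coordinator that has never timed Y keeps sending reads to X.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the fallback described in the issue;
// not Cassandra's actual DynamicEndpointSnitch implementation.
final class SnitchSketch {

    /**
     * Orders replicas by observed latency, but only when this node has a
     * latency score for every replica. If any score is missing, the static
     * (subsnitch) order is returned unchanged.
     *
     * @param staticOrder replicas in subsnitch (static) order
     * @param latencyMs   latencies observed by this node, keyed by replica name
     */
    static List<String> sortByProximity(List<String> staticOrder,
                                        Map<String, Double> latencyMs) {
        for (String replica : staticOrder) {
            if (!latencyMs.containsKey(replica)) {
                // One replica has no score: fall back to static order entirely.
                return new ArrayList<>(staticOrder);
            }
        }
        List<String> sorted = new ArrayList<>(staticOrder);
        // List.sort is stable, so replicas with equal latency keep static order.
        sorted.sort(Comparator.comparingDouble((String r) -> latencyMs.get(r)));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> replicas = List.of("X", "Y"); // X in dc1, Y in dc2

        // Coordinator has only ever timed X, so there is no score for Y.
        System.out.println(sortByProximity(replicas, Map.of("X", 250.0)));
        // prints [X, Y]

        // With its own timing for Y, the coordinator can prefer Y.
        System.out.println(sortByProximity(replicas, Map.of("X", 250.0, "Y", 40.0)));
        // prints [Y, X]
    }
}
```

In this model the first call returns the static order however slow X is, which is the situation the issue describes. Once the coordinator performs the digest checks itself, it would hold latency information for both replicas, as in the second call.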

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
