From Aaron Morton <>
Subject Re: How to get the result from the closest node
Date Wed, 27 Oct 2010 00:38:09 GMT
Let's start with the simple case: all nodes have the same proximity to each other.

The client connects to a random node, called the coordinator. When a request is made, the coordinator asynchronously sends
it to all nodes that are replicas for the requested key. It then waits for the responses and, in
the best case for a read, can return the data to the client as soon as CL nodes have responded.
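
As a rough illustration of the read path described above, here is a minimal Python sketch of a coordinator that sends a read to every replica and returns once CL of them have answered. It is a simulation, not Cassandra code: `read_from_replica`, `coordinator_read`, and the latency figures are hypothetical names and values chosen for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def read_from_replica(name, latency_s, value):
    """Hypothetical stand-in for a network read from one replica."""
    time.sleep(latency_s)  # simulated network and disk latency
    return name, value


def coordinator_read(replicas, consistency_level):
    """Send the read to all replicas; return once `consistency_level` have replied.

    `replicas` maps replica name -> (simulated latency in seconds, stored value).
    """
    if not 1 <= consistency_level <= len(replicas):
        raise ValueError("CL must be between 1 and the number of replicas")
    pool = ThreadPoolExecutor(max_workers=len(replicas))
    futures = [pool.submit(read_from_replica, name, latency, value)
               for name, (latency, value) in replicas.items()]
    responses = []
    for future in as_completed(futures):
        responses.append(future.result())
        if len(responses) >= consistency_level:
            break  # enough replicas answered; do not wait for the slower ones
    pool.shutdown(wait=False)  # slower reads finish in the background
    return responses


if __name__ == "__main__":
    # Assumed latencies: D is close, C is far away.
    replicas = {"D": (0.01, "v1"), "C": (0.3, "v1")}
    print(coordinator_read(replicas, consistency_level=1))
```

With a CL of 1 the call returns as soon as the faster replica replies, while a CL of 2 has to wait for the slower one as well.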

The client does not have any knowledge of where the data is located in the cluster. That's
the job of the coordinator, and it takes only one hop to get to each replica.

The consistency check is between the full data returned from one node and a digest returned
from the others. If it fails then RR (Read Repair) will kick in (always at CL > ONE; at CL ONE it is probabilistic).
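
To make the digest comparison concrete, here is a small Python sketch. It assumes an MD5 digest (the 32-character hex digest in the logs further down is consistent with that) computed over a simplified serialization invented for this example; the real on-wire format differs, and `digest` and `replicas_needing_repair` are hypothetical helper names.

```python
import hashlib


def digest(columns):
    """MD5 over an illustrative serialization of (name, value, timestamp) columns."""
    h = hashlib.md5()
    for name, value, timestamp in sorted(columns):
        h.update(f"{name}:{value}:{timestamp};".encode("utf-8"))
    return h.hexdigest()


def replicas_needing_repair(full_data, replica_digests):
    """Return the replicas whose digest does not match the full data that was read."""
    expected = digest(full_data)
    return [replica for replica, d in replica_digests.items() if d != expected]


if __name__ == "__main__":
    # Made-up column data: C holds an older timestamp for the "first" column.
    fresh = [("first", "a", 3), ("last", "b", 2)]
    stale = [("first", "a", 1), ("last", "b", 2)]
    print(replicas_needing_repair(fresh, {"C": digest(stale)}))
```

Only the node asked for the full data ships the row itself; the others send a short hash, which keeps the check cheap when the replicas agree.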

Read about the RackAwareStrategy (now NetworkTopologyStrategy) and
the Snitch. These
features let you tell Cassandra about your topology.
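
To show why topology information matters for picking a close replica, here is a hedged Python sketch that ranks replicas by distance from the coordinator. The `topology` map, the distance scale, and the function names are assumptions made for this example; they are not the Snitch API, and the data center and rack assignments are invented.

```python
def proximity(coordinator, replica, topology):
    """Lower is closer: 0 same node, 1 same rack, 2 same data center, 3 elsewhere."""
    if coordinator == replica:
        return 0
    c_dc, c_rack = topology[coordinator]
    r_dc, r_rack = topology[replica]
    if c_dc == r_dc and c_rack == r_rack:
        return 1
    if c_dc == r_dc:
        return 2
    return 3


def sort_by_proximity(coordinator, replicas, topology):
    """Order replicas from nearest to farthest as seen from the coordinator."""
    return sorted(replicas, key=lambda r: proximity(coordinator, r, topology))


if __name__ == "__main__":
    # Hypothetical layout: node -> (data center, rack).
    topology = {
        "A": ("NY", "r1"),
        "B": ("NY", "r2"),
        "C": ("London", "r1"),
        "D": ("London", "r2"),
    }
    print(sort_by_proximity("A", ["D", "C", "B"], topology))
```

Without rack and data center information every replica looks equally distant, so there is no basis for preferring a local one.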

You will then want to use something like DCQUORUM or DCQUORUMSYNC (0.7+ AFAIK) for your

Hope that helps.

On 27 Oct, 2010, at 11:58 AM, Joe Alex <> wrote:


I have Cassandra 0.6.6 running on 4 nodes with RF=2.

Let's say the nodes are A, B, C, D.

If I have clients A1, B1, C1, D1 connected to their respective nodes, what
happens when A1 requests A for a key "100" for which D is responsible
as per the Token? C has the second copy.
As per the logs, A1 requests A, which requests D and gets the data. D
also runs a consistency check in the background on C.
If I have RF=3 I assume D will do 2 consistency checks.

If I need to get the data from A itself, with minimum latency and
minimal network traversal between data centers, is this what I need to do?

1. maybe RF=4 or at least >= 3
2. Adjust Read Consistency (ONE, QUORUM, DCQUORUM...)
3. Use RackAware strategy with DCQUORUM
4. Adjust Write Consistency

Is there a way to read/write the data from the closest node? For example, A
is in NY, D in London, etc.
For the above example, key=100: A1 calls A, and A gets the data all the way from D.
Also, when A1 writes key=100, the data needs to be written to D and C by A.

Probably need RF=4 for this in combination with DCQUORUM or ANY/ONE?
Want to know how everybody is approaching these cases?

DEBUG [pool-1-thread-21] 2010-10-26 18:29:25,231
(line 216) get_slice
DEBUG [pool-1-thread-21] 2010-10-26 18:29:25,231
(line 386) weakread reading SliceFromReadCommand(table='Keyspace1',
key='100', column_parent='QueryPath(columnFamilyName='Standard2',
superColumnName='null', columnName='null')', start='', finish='',
reversed=true, count=1000000) from 1311748@/
DEBUG [RESPONSE-STAGE:2] 2010-10-26 18:29:25,234
ResponseVerbHandler.java (line 52) Processing response on an async
result from 1311748@/
DEBUG [Timer-1] 2010-10-26 18:29:26,511 (line
36) Disseminating load info ...

DEBUG [ROW-READ-STAGE:5] 2010-10-26 18:29:19,415
(line 116) collecting middle:false:1@1288128381467000
DEBUG [ROW-READ-STAGE:5] 2010-10-26 18:29:19,415
(line 116) collecting last:false:3@1288128369639000
DEBUG [ROW-READ-STAGE:5] 2010-10-26 18:29:19,415 SliceQueryFilter.java
(line 116) collecting first:false:4@1288128358062000
DEBUG [ROW-READ-STAGE:5] 2010-10-26 18:29:19,415
(line 93) Read key 100; sending response to 1311748@/
DEBUG [CONSISTENCY-MANAGER:4] 2010-10-26 18:29:19,416 (line 73) Reading consistency digest for 100
from 1081388@[/, /]
DEBUG [RESPONSE-STAGE:1] 2010-10-26 18:29:19,418 (line 42) Processing response on a callback
from 1081388@/10.21032.93

DEBUG [ROW-READ-STAGE:4] 2010-10-26 18:29:25,237
(line 116) collecting middle:false:1@1288128381467000
DEBUG [ROW-READ-STAGE:4] 2010-10-26 18:29:25,238
(line 116) collecting last:false:3@1288128369639000
DEBUG [ROW-READ-STAGE:4] 2010-10-26 18:29:25,238
(line 116) collecting first:false:4@1288128358062000
DEBUG [ROW-READ-STAGE:4] 2010-10-26 18:29:25,238
(line 75) digest is c1ba97c56693d7fe4cbb9ac0544034b3
DEBUG [ROW-READ-STAGE:4] 2010-10-26 18:29:25,238
(line 93) Read key 100; sending response to 1081388@/

