cassandra-user mailing list archives

From Nikolai Grigoriev <ngrigor...@gmail.com>
Subject Re: coordinator selection in remote DC
Date Thu, 20 Nov 2014 17:22:39 GMT
Hmmm...I am using:

endpoint_snitch: com.datastax.bdp.snitch.DseDelegateSnitch

which is using:

delegated_snitch: org.apache.cassandra.locator.PropertyFileSnitch

(for this specific test cluster)

I did not check the code - is the dynamic snitch on by default and, perhaps,
used as a wrapper around the configured endpoint_snitch?

That would certainly explain the difference in the inter-DC traffic. It would
also not affect the local-DC traffic, since all local nodes are replicas for
the data anyway.
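
If it is indeed a wrapper, I would expect something like the decorator shape
below. A simplified, standalone sketch of the idea (none of these are
Cassandra's real class names - just an illustration of the wrapping):

import java.util.List;

interface Snitch {
    // Orders the given endpoints by proximity to 'self'.
    List<String> sortByProximity(String self, List<String> endpoints);
}

class StaticSnitch implements Snitch {
    // Stands in for PropertyFileSnitch: a fixed DC/rack-based ordering.
    public List<String> sortByProximity(String self, List<String> endpoints) {
        return endpoints; // assume already grouped by DC and rack
    }
}

class DynamicSnitchWrapper implements Snitch {
    private final Snitch delegate;
    DynamicSnitchWrapper(Snitch delegate) { this.delegate = delegate; }
    public List<String> sortByProximity(String self, List<String> endpoints) {
        // Take the delegate's static ordering, then reorder by recently
        // observed latency scores (the score bookkeeping is elided here).
        return delegate.sortByProximity(self, endpoints);
    }
}

public class SnitchWiring {
    public static void main(String[] args) {
        boolean dynamicSnitchEnabled = true; // cassandra.yaml: dynamic_snitch
        Snitch snitch = new StaticSnitch();
        if (dynamicSnitchEnabled)
            snitch = new DynamicSnitchWrapper(snitch); // the wrapping itself
        System.out.println(snitch.getClass().getSimpleName());
    }
}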


On Thu, Nov 20, 2014 at 12:03 PM, Tyler Hobbs <tyler@datastax.com> wrote:

> The difference is likely due to the DynamicEndpointSnitch (aka dynamic
> snitch), which picks replicas to send messages to based on recently
> observed latency and self-reported load (accounting for compactions,
> repair, etc).  If you want to confirm this, you can disable the dynamic
> snitch by adding this line to cassandra.yaml: "dynamic_snitch: false".
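>
> A minimal cassandra.yaml sketch of that (these tunables ship in the stock
> config; the values shown are the usual defaults - worth double-checking
> against your version):
>
> dynamic_snitch: false
> # or keep it enabled and tune how aggressively it reorders replicas:
> # dynamic_snitch_update_interval_in_ms: 100
> # dynamic_snitch_reset_interval_in_ms: 600000
> # dynamic_snitch_badness_threshold: 0.1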
>
> On Thu, Nov 20, 2014 at 9:52 AM, Nikolai Grigoriev <ngrigoriev@gmail.com>
> wrote:
>
>> Hi,
>>
>> There is something odd I have observed when testing a two-DC configuration
>> for the first time. I wanted to do a simple functional test to prove to
>> myself (and my pessimistic colleagues ;) ) that it works.
>>
>> I have a test cluster of 6 nodes, 3 in each DC, and a keyspace that is
>> replicated as follows:
>>
>> CREATE KEYSPACE xxxxxxx WITH replication = {
>>
>>   'class': 'NetworkTopologyStrategy',
>>
>>   'DC2': '3',
>>
>>   'DC1': '3'
>>
>> };
>>
>>
>> I have disabled the traffic compression between DCs to get more accurate
>> numbers.
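>>
>> (For reference: in cassandra.yaml, internode_compression accepts all, dc
>> and none; dc compresses only the traffic between DCs, so turning inter-DC
>> compression off means setting it to none - assuming that is the mechanism
>> used here.)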
>>
>> I have set up a bunch of IP accounting rules on each node so that they
>> count the outgoing traffic from that node to each other node. I had rules
>> for different ports but, of course, it is mostly about port 7000 (or 7001)
>> when talking about inter-node traffic. Anyway, I have a table that shows
>> the traffic from any node to any other node's port 7000.
>>
>> I ran a test with DCAwareRoundRobinPolicy and the client talking only to
>> DC1 nodes. Everything looks fine: the client sent an identical amount of
>> data to each of the 3 nodes in DC1, and these nodes (I was writing with
>> LOCAL_ONE consistency) sent similar amounts of data to each other inside
>> DC1, representing exactly the two extra replicas.
>>
>> However, when I look at the traffic from the nodes in DC1 to the nodes in
>> DC2, the picture is different:
>>
>> source       destination  port      bytes sent
>> 10.3.45.156  10.3.45.159  dpt:7000  117,273,075
>> 10.3.45.156  10.3.45.160  dpt:7000  228,326,091
>> 10.3.45.156  10.3.45.161  dpt:7000   46,924,339
>> 10.3.45.157  10.3.45.159  dpt:7000  118,978,269
>> 10.3.45.157  10.3.45.160  dpt:7000  230,444,929
>> 10.3.45.157  10.3.45.161  dpt:7000   47,394,179
>> 10.3.45.158  10.3.45.159  dpt:7000  113,969,248
>> 10.3.45.158  10.3.45.160  dpt:7000  225,844,838
>> 10.3.45.158  10.3.45.161  dpt:7000   46,338,939
>>
>> Nodes 10.3.45.156-.158 are in DC1 and .159-.161 are in DC2. As you can
>> see, each node in DC1 has sent a very different amount of traffic to each
>> of the remote nodes: roughly 117MB, 228MB and 46MB respectively. Each DC
>> has a single rack.
>>
>> So, here is my question: how does a node select the node in the remote DC
>> to send the message to? I did a quick sweep through the code, and I could
>> only find the sorting by proximity (checking the rack and DC). So,
>> considering that each request I fire targets all 3 nodes in the remote DC,
>> the list will contain all 3 nodes in DC2. And, if I understood correctly,
>> the first node from that list is picked to send the message to.
>>
>> So, it seems to me that no round-robin-type logic is applied when
>> selecting, from the list of targets in the remote DC, the node to forward
>> the write to.
>>
>> If this is true (and the numbers kind of show it is, right?), then perhaps
>> the list of equal-proximity targets should be shuffled randomly? Or,
>> instead of always picking the first target, a random one should be picked?
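>>
>> To illustrate the skew, here is a standalone toy (not Cassandra code; the
>> endpoint list and request count are made up) comparing always-pick-first
>> against shuffling the equal-proximity targets:
>>
>> import java.util.*;
>>
>> public class RemoteTargetPick {
>>     public static void main(String[] args) {
>>         List<String> remoteDc =
>>             Arrays.asList("10.3.45.159", "10.3.45.160", "10.3.45.161");
>>         Map<String, Integer> pickFirst = new HashMap<>();
>>         Map<String, Integer> shuffled = new HashMap<>();
>>         Random rng = new Random(42);
>>
>>         for (int i = 0; i < 90_000; i++) {
>>             // The proximity sort is stable: all three targets share the
>>             // same DC and rack, so the order never changes and the first
>>             // entry receives every forwarded write.
>>             pickFirst.merge(remoteDc.get(0), 1, Integer::sum);
>>
>>             // Shuffling the equal-proximity targets first spreads the load.
>>             List<String> copy = new ArrayList<>(remoteDc);
>>             Collections.shuffle(copy, rng);
>>             shuffled.merge(copy.get(0), 1, Integer::sum);
>>         }
>>         System.out.println("pick-first: " + pickFirst); // one node gets all
>>         System.out.println("shuffled:   " + shuffled);  // ~30,000 each
>>     }
>> }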
>>
>>
>> --
>> Nikolai Grigoriev
>>
>>
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>



-- 
Nikolai Grigoriev
(514) 772-5178
