incubator-cassandra-user mailing list archives

From Aaron Morton <>
Subject Re: Unbalanced ring mystery multi-DC issue with 1.1.11
Date Wed, 02 Oct 2013 03:00:15 GMT
Check the logs for messages about nodes going up and down, and also look at the MessagingService
MBean for timeouts. If the node in DC2 times out replying to DC1, the DC1 node will store
a hint.

Also, when hints are stored they are TTL'd to the gc_grace_seconds of the CF (IIRC). If that's
low, the hints may not have been delivered.
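To make that failure mode concrete: under the behaviour described above, a stored hint carries a TTL equal to the target CF's gc_grace_seconds, so if the remote replica stays unreachable longer than that window, the hint expires undelivered. A minimal sketch of the arithmetic (the function name is illustrative, not a Cassandra API):

```python
DEFAULT_GC_GRACE_SECONDS = 864_000  # Cassandra default: 10 days

def hint_survives(gc_grace_seconds, seconds_until_replica_reachable):
    # A hint written with TTL = gc_grace_seconds expires (and is silently
    # discarded) if the target replica stays unreachable longer than that.
    return seconds_until_replica_reachable <= gc_grace_seconds

# With the default gc_grace, an hour-long WAN hiccup is survivable...
assert hint_survives(DEFAULT_GC_GRACE_SECONDS, 3_600)
# ...but a CF tuned with a very low gc_grace loses its hints, leaving the
# remote DC behind until repair or read-repair catches it up.
assert not hint_survives(3_600, 7_200)
```

This is why a CF with aggressively lowered gc_grace_seconds can look healthy locally while the remote DC silently misses writes.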

I'm not aware of any specific tracking for failed hints other than log messages.


Aaron Morton
New Zealand

Co-Founder & Principal Consultant
Apache Cassandra Consulting

On 28/09/2013, at 12:01 AM, Oleg Dulin <> wrote:

> Here is some more information.
> I am running full repair on one of the nodes and I am observing strange behavior.
> Both DCs were up during the data load. But repair is reporting a lot of out-of-sync data. Why would that be? Is there a way for me to tell that the WAN may be dropping hinted handoff traffic?
> Regards,
> Oleg
> On 2013-09-27 10:35:34 +0000, Oleg Dulin said:
>> Wanted to add one more thing:
>> I can also tell that the numbers are not consistent across DCs this way -- I have a column family with really wide rows (a couple million columns).
>> DC1 reports higher column counts than DC2. DC2 only becomes consistent after I run the command a couple of times and trigger a read-repair. But why would the nodetool repair logs show that everything is in sync?
>> Regards,
>> Oleg
>> On 2013-09-27 10:23:45 +0000, Oleg Dulin said:
>>> Consider this output from nodetool ring:
>>> Address    DC   Rack  Status  State   Load      Effective-Ownership  Token
>>>                                                                      127605887595351923798765477786913079396
>>> dc1.5      DC1  RAC1  Up      Normal  32.07 GB  50.00%               0
>>> dc2.100    DC2  RAC1  Up      Normal  8.21 GB   50.00%               100
>>> dc1.6      DC1  RAC1  Up      Normal  32.82 GB  50.00%               42535295865117307932921825928971026432
>>> dc2.101    DC2  RAC1  Up      Normal  12.41 GB  50.00%               42535295865117307932921825928971026532
>>> dc1.7      DC1  RAC1  Up      Normal  28.37 GB  50.00%               85070591730234615865843651857942052864
>>> dc2.102    DC2  RAC1  Up      Normal  12.27 GB  50.00%               85070591730234615865843651857942052964
>>> dc1.8      DC1  RAC1  Up      Normal  27.34 GB  50.00%               127605887595351923798765477786913079296
>>> dc2.103    DC2  RAC1  Up      Normal  13.46 GB  50.00%               127605887595351923798765477786913079396
>>> I concealed IPs and DC names for confidentiality.
>>> All of the data loading was happening against DC1 at a pretty brisk rate of, say, 200K writes per minute.
>>> Note how my tokens are offset by 100. Shouldn't that mean that the load on each node should be roughly identical? In DC1 it is roughly 30 GB on each node. In DC2 it is almost 1/3 of the nearest DC1 node by token range.
>>> To verify that the nodes are in sync, I ran nodetool -h localhost repair MyKeySpace --partitioner-range on each node in DC2. Watching the logs, I see that the repair went really quickly and all column families were in sync!
>>> I need help making sense of this. Is it because DC1 is not fully compacted? Is it because DC2 is not fully synced and I am not checking correctly? How can I tell whether replication is still in progress (note: I started my load yesterday at 9:50am)?
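The token layout in the ring output above follows the standard multi-DC recipe for RandomPartitioner: compute evenly spaced tokens within each DC, then shift the second DC by a small constant (here 100) so no two nodes share a token. A quick sketch reproducing the quoted values:

```python
TOKEN_SPACE = 2**127  # RandomPartitioner tokens fall in [0, 2**127)

def dc_tokens(n_nodes, offset=0):
    # Evenly spaced tokens for one DC, shifted by a per-DC offset so that
    # tokens never collide across DCs.
    return [i * (TOKEN_SPACE // n_nodes) + offset for i in range(n_nodes)]

dc1 = dc_tokens(4, offset=0)
dc2 = dc_tokens(4, offset=100)

# These match the nodetool ring output quoted above.
assert dc1[1] == 42535295865117307932921825928971026432
assert dc2[1] == 42535295865117307932921825928971026532
assert dc1[3] == 127605887595351923798765477786913079296
assert dc2[3] == 127605887595351923798765477786913079396
```

Since NetworkTopologyStrategy places replicas per DC, each node owning 50% of its own DC's ring is exactly what this layout should produce, so the load imbalance between DC1 and DC2 is not explained by token assignment.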
> -- 
> Regards,
> Oleg Dulin
