cassandra-user mailing list archives

From Alain RODRIGUEZ <>
Subject Re: Adding Nodes With Inconsistent Data
Date Wed, 24 Jun 2015 20:58:43 GMT
It looks to me like that can indeed happen, in theory (I might be wrong).


- Hinted Handoff tends to reduce this issue. If this is a big worry, you
might want to make sure HH is enabled and well tuned.
- Read Repairs (synchronous or not) might also have mitigated things, if you
read fresh data. You can set the read repair chance to a higher value.
- After an outage, you should always run a nodetool repair on the node that
went down - following the best practices, or because you understand the
reasons - or just trust HH if that is enough for you.
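
To illustrate why repairing the node that missed writes matters, here is a minimal toy sketch of the idea behind repair. It is not Cassandra's implementation (real repair compares Merkle trees and streams only the differences); the `repair` function and the `stores` layout are hypothetical names used only for this example, and it assumes simple last-write-wins on timestamps.

```python
# Toy model only: each node is a dict of key -> (timestamp, value).
# Hypothetical helper, not a Cassandra API.

def repair(stores, replicas):
    """Bring every listed replica up to the newest (timestamp, value) per key."""
    merged = {}
    for node in replicas:
        for key, (ts, value) in stores[node].items():
            # Keep the entry with the highest timestamp (last write wins).
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, value)
    for node in replicas:
        stores[node].update(merged)

# Nodes 1 and 2 received the write for key K; node 3 dropped the mutation.
stores = {1: {"K": (10, "v")}, 2: {"K": (10, "v")}, 3: {}}

missing_before = "K" not in stores[3]
repair(stores, [1, 2, 3])
present_after = stores[3].get("K") == (10, "v")
```

In this model, once node 3 has been repaired, any node that later copies its data from node 3 also receives K.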

So I would say that you can always "shoot yourself in the foot", whatever
you do, yet following best practices or understanding the internals is the
key, IMHO.

I would say it is a good question though.


2015-06-24 19:43 GMT+02:00 Anuj Wadehra <>:

> Hi,
> We faced a scenario where we lost a little data after adding 2 nodes to the
> cluster. There were intermittent dropped mutations in the cluster. I need to
> verify my understanding of how this may have happened, for a Root Cause
> Analysis:
> Scenario: 3 nodes, RF=3, Read / Write CL= Quorum
> 1. Due to the overloaded cluster, some writes only happened on 2 nodes: node 1
> & node 2, while the asynchronous mutations were dropped on node 3.
> So say key K with Token T was not written to node 3.
> 2. I added node 4 and suppose as per newly calculated ranges, now token T
> is supposed to have replicas on node 1, node 3, and node 4. Unfortunately
> node 4 started bootstrapping from node 3 where key K was missing.
> 3. After the recommended 2 min gap, I added node 5 and, as per the new token
> distribution, suppose token T is now supposed to have replicas on node 3,
> node 4 and node 5. Again node 5 bootstrapped from node 3, where the data was
> missing.
> So now key K is lost, and that's how we lost a few rows.
> Moreover, in step 1 the situation could be worse. We can also have a scenario
> where some writes only happened on one of the three replicas and Cassandra
> chooses replicas where this data is missing for streaming ranges to the 2 new
> nodes.
> Am I making sense?
> We are using C* 2.0.3.
> Thanks
> Anuj
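
The scenario quoted above can be walked through with a small toy model. This is only a sketch under stated assumptions, not Cassandra code: it assumes a bootstrapping node copies the range for token T from a single existing replica (node 3, as the message describes), and the names `bootstrap`, `read_any`, `stores` and `replicas` are hypothetical. The read helper is deliberately generous: it consults every current replica rather than just a quorum.

```python
# Toy model of the quoted scenario; hypothetical names, not Cassandra APIs.

def bootstrap(stores, new_node, source):
    """Assumption: the new node copies its data from ONE existing replica."""
    stores[new_node] = dict(stores[source])

def read_any(stores, replicas, key):
    """Return the value if any current replica holds the key, else None."""
    for node in replicas:
        if key in stores[node]:
            return stores[node][key]
    return None

# Step 1: the write for K lands on nodes 1 and 2; node 3 drops the mutation.
stores = {1: {"K": "v"}, 2: {"K": "v"}, 3: {}}
replicas = [1, 2, 3]
before = read_any(stores, replicas, "K")

# Step 2: node 4 bootstraps from node 3; replicas for T become 1, 3, 4.
bootstrap(stores, 4, source=3)
replicas = [1, 3, 4]
after_node4 = read_any(stores, replicas, "K")

# Step 3: node 5 bootstraps from node 3; replicas for T become 3, 4, 5.
bootstrap(stores, 5, source=3)
replicas = [3, 4, 5]
after_node5 = read_any(stores, replicas, "K")

# In this model nodes 1 and 2 still hold K, but they are no longer
# replicas for token T, so reads for K never consult them.
print(before, after_node4, after_node5)
```

In this model the key is still readable after step 2 only because node 1 remains a replica; after step 3 none of the replicas for token T hold it.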
