cassandra-user mailing list archives

From Alok Dwivedi
Subject Re: Bootstrapping to Replace a Dead Node vs. Adding a NewNode:Consistency Guarantees
Date Wed, 01 May 2019 23:31:05 GMT
CASSANDRA-2434 ensures that when we add a new node, it streams data from the replica that it
will replace for that range once the data has been completely streamed. This is explained in
detail in the blog post you shared. It ensures that you continue to get the same consistency as
before the new node was added. So if new node D now owns a token range that was originally owned
by replicas A, B & C, this fix ensures that if D streams from A, then A is the node that no
longer owns that token range once D has fully joined the cluster. It avoids the previous issue
where D could stream from A but B was later the one that gave up its ownership of the range to
D. If A never had the data, you have effectively lost what you had in B, because B no longer
owns that token range. Hence CASSANDRA-2434 helps with consistency by ensuring that the node
used for streaming (A) is the one that gives up the range, so the new node (D) together with
the remaining replicas (B & C) should give you the same consistency as you had before D joined
the cluster.
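
The scenario above can be sketched as a toy model. This is only an illustration under stated assumptions, not Cassandra's implementation: each replica is a plain Python set of keys for a single token range, and the names `bootstrap` and `quorum_read_can_miss` are hypothetical helpers invented for this sketch.

```python
from itertools import combinations

def bootstrap(replicas, source, leaving, new_node):
    """Toy model: new_node copies source's data, then `leaving` gives up the range.

    replicas maps node name -> set of keys it holds for one token range.
    Returns the replica map after new_node has joined.
    """
    after = {name: set(keys) for name, keys in replicas.items() if name != leaving}
    after[new_node] = set(replicas[source])
    return after

def quorum_read_can_miss(replicas, key):
    """True if some majority of replicas exists in which no node holds key."""
    quorum = len(replicas) // 2 + 1
    return any(
        all(key not in replicas[node] for node in group)
        for group in combinations(sorted(replicas), quorum)
    )

# k1 was written at QUORUM while A was unavailable, so only B and C hold it.
before = {"A": set(), "B": {"k1"}, "C": {"k1"}}

# Hazard described above: D streams from stale A, but B is the node that gives up the range.
inconsistent = bootstrap(before, source="A", leaving="B", new_node="D")

# Consistent range movement: the streaming source is also the node that gives up the range.
consistent_from_b = bootstrap(before, source="B", leaving="B", new_node="D")
consistent_from_a = bootstrap(before, source="A", leaving="A", new_node="D")

print(quorum_read_can_miss(inconsistent, "k1"))
print(quorum_read_can_miss(consistent_from_b, "k1"))
print(quorum_read_can_miss(consistent_from_a, "k1"))
```

In this model a quorum read can miss `k1` only in the first case. When the source and the departing owner are the same node, the set of replicas holding `k1` is never smaller than before, even if the source itself was stale.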

Replacing a dead node is different in the sense that the node from which the replacing node
streams data also remains an owner of that data. So let’s say you had nodes A, B and C, C is
dead and you replace C with D. Now D can stream from either A or B, but whichever it chooses
will continue to own that token range, i.e. after D replaces C we have A, B & D instead
of A, B and C (as C is dead).
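
The difference in ownership between the two cases can be written down in a few lines. This is a minimal sketch of the description above for a single token range; the function names are hypothetical and not part of Cassandra.

```python
def replica_set_after_expansion(owners, source, new_node):
    """Adding a node with consistent range movement: the streaming source
    hands its place for this range to new_node."""
    return [new_node if node == source else node for node in owners]

def replica_set_after_replacement(owners, dead, new_node):
    """Replacing a dead node: new_node takes the dead node's place and the
    live owners, including the streaming source, stay where they are."""
    return [new_node if node == dead else node for node in owners]

owners = ["A", "B", "C"]

expanded = replica_set_after_expansion(owners, source="A", new_node="D")
replaced = replica_set_after_replacement(owners, dead="C", new_node="D")

print(expanded)  # ['D', 'B', 'C']
print(replaced)  # ['A', 'B', 'D']
```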

My understanding is that the restriction to a single node at a time was applied at cluster
expansion time to avoid clashes in token selection, which only arise when extending the cluster
by adding a new node (not when replacing a dead node). This is what CASSANDRA-7069 addresses.

I think in your case, when replacing more than one node, doing it serially won’t in theory
overcome the issue which I guess you are highlighting here, which is: if I have to stream
from A or B, how do I cover the case where A has some of the right data while B has
other right data? I think streaming will use one source. So whether you do it serially
or several at a time, you have that risk (IMO). If I were you, I would do it one node at a
time to avoid overloading my cluster, and then I would run a repair to sync any data I might
have missed (because the source chosen during streaming didn’t have it). Then I would move on
to the same steps with the next dead node to be replaced.
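
The risk and the suggested remedy can be illustrated with a toy model. It assumes, as stated above, that the replacing node streams a range from a single source. Replicas are plain sets of keys, and `replace_dead_node` and `repair` are hypothetical stand-ins: a real `nodetool repair` compares and streams data between replicas and is far more involved than a set union.

```python
def replace_dead_node(replicas, dead, new_node, source):
    """Toy model: new_node takes over dead's range and copies only what source holds."""
    after = {name: set(keys) for name, keys in replicas.items() if name != dead}
    after[new_node] = set(replicas[source])
    return after

def repair(replicas):
    """Toy stand-in for a repair: every replica ends up with the union of all keys."""
    everything = set().union(*replicas.values())
    return {name: set(everything) for name in replicas}

# A and B each miss something the other has; C held both keys but is dead.
ring = {"A": {"k1"}, "B": {"k2"}, "C": {"k1", "k2"}}

streamed = replace_dead_node(ring, dead="C", new_node="D", source="A")
print(sorted(streamed["D"]))   # ['k1']  (k2 now lives only on B)

repaired = repair(streamed)
print(sorted(repaired["D"]))   # ['k1', 'k2']
```

Whichever single source D picks, it starts out missing whatever only the other replica held, which is why the repair step follows each replacement in this sketch.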

Alok Dwivedi
Senior Consultant

From: Fd Habash
Date: Thursday, 2 May 2019 at 08:26
Subject: RE: Bootstrapping to Replace a Dead Node vs. Adding a NewNode:Consistency Guarantees

Appreciate your response.

As for extending the cluster & keeping the default (consistent range movement = true), C* won’t
allow me to bootstrap multiple nodes anyway.

But the question I’m still posing, and have not gotten an answer for, is this: if the fix in
CASSANDRA-2434 disallows bootstrapping multiple nodes to extend the cluster (which I was able
to test in my lab cluster), why did it allow me to bootstrap multiple nodes in the process of
replacing dead nodes (no range calc)?

This fix forces a node to bootstrap from the former owner. Is this still the case when
bootstrapping to replace a dead node?

Thank you

Sent: Wednesday, May 1, 2019 5:13 PM
Subject: RE: Bootstrapping to Replace a Dead Node vs. Adding a NewNode:Consistency Guarantees

The article you mentioned here clearly says: “For new users to Cassandra, the safest way
to add multiple nodes into a cluster is to add them one at a time. Stay tuned as I will be
following up with another post on bootstrapping.”

When extending a cluster it is indeed recommended to go slowly & serially. Optionally you
can use cassandra.consistent.rangemovement=false, but you may end up with over-streamed
data. Since you’re using a release much newer than the one in which the fix was introduced, I
assumed you won’t see the behavior described for the version the fix addresses. After adding a
node, if you don’t get consistent data, your query consistency level should be able to pull
consistent data, provided you can tolerate a bit of latency until your repair is complete. If
you go by the recommendation, i.e. add one node at a time, you’ll avoid all these nuances.

From: Fd Habash
Sent: Wednesday, May 01, 2019 3:12 PM
Subject: RE: Bootstrapping to Replace a Dead Node vs. Adding a New Node:Consistency Guarantees

Probably, I needed to be clearer in my inquiry ….

I’m investigating a situation where our diagnostic data is telling us that C* has lost some
of the application data. I mean, getsstables for the data returns zero on all nodes in all

The Last Pickle article below & Jeff Jirsa had described a situation where bootstrapping
a node to extend the cluster can lose data if the new node bootstraps from a stale SECONDARY
replica (a node that was offline longer than the hinted handoff window). This was fixed in
CASSANDRA-2434.

The article & the Jira above describe bootstrapping when extending a cluster.

I understand replacing a dead node does not involve range movement, but will the above Jira
fix prevent the bootstrap process, when replacing a dead node, from using a secondary replica?


Thank you

From: Fred Habash
Sent: Wednesday, May 1, 2019 6:50 AM
Subject: Re: Bootstrapping to Replace a Dead Node vs. Adding a New Node:Consistency Guarantees

Thank you.

Range movement is one reason this is enforced when adding a new node. But, what about forcing
a consistent bootstrap i.e. bootstrapping from primary owner of the range and not a secondary

How is consistent bootstrap enforced when replacing a dead node?

Thank you.

On Apr 30, 2019, at 7:40 PM, Alok Dwivedi wrote:
When a new node joins the ring, it needs to own new token ranges. These should be unique to
the new node, and we don’t want to end up in a situation where two nodes joining simultaneously
can own the same range (ideally the ranges are also evenly distributed). Cassandra has a
2 minute wait rule for gossip state to propagate before a node is added, but this on its own
does not guarantee that token ranges can’t overlap. See this ticket for more details.
To overcome this issue, the approach was to allow only one node to join at a time.

When you replace a dead node, the new token range selection does not apply, as the replacing
node just takes over the token ranges of the dead node. I think that’s why the restriction of
only one node at a time does not apply in this case.

Alok Dwivedi
Senior Consultant

From: Fd Habash
Date: Wednesday, 1 May 2019 at 06:18
Subject: Bootstrapping to Replace a Dead Node vs. Adding a New Node: Consistency Guarantees

After reviewing the documentation, and based on my testing with C* 2.2.8, I was not able to
extend the cluster by adding multiple nodes simultaneously. I got an error message …

Other bootstrapping/leaving/moving nodes detected, cannot bootstrap while cassandra.consistent.rangemovement
is true

I understand this is to force a node to bootstrap from the former owner of the range when
adding a node as part of extending the cluster.

However, I was able to bootstrap multiple nodes to replace dead nodes. C* did not complain
about it.
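
The two observations above can be summarised as a small guard. This is a hedged sketch of the behaviour reported in this thread for C* 2.2.8, not Cassandra's actual source code. The function name and its parameters are hypothetical, and only the error text is taken from the message quoted above.

```python
ERROR = ("Other bootstrapping/leaving/moving nodes detected, "
         "cannot bootstrap while cassandra.consistent.rangemovement is true")

def check_can_join(replacing, consistent_rangemovement, nodes_in_motion):
    """Toy guard modelling the observed behaviour.

    replacing: True if the joining node replaces a dead node.
    consistent_rangemovement: value of cassandra.consistent.rangemovement.
    nodes_in_motion: other nodes currently bootstrapping, leaving or moving.
    """
    if replacing:
        # Observed: replacements were not rejected, even several at once.
        return
    if consistent_rangemovement and nodes_in_motion:
        raise RuntimeError(ERROR)

# A second node joining to extend the cluster is rejected ...
try:
    check_can_join(replacing=False, consistent_rangemovement=True,
                   nodes_in_motion=["node2"])
except RuntimeError as exc:
    print(exc)

# ... while a second replacement is allowed to proceed.
check_can_join(replacing=True, consistent_rangemovement=True,
               nodes_in_motion=["node2"])
```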

Is consistent range movement & the guarantee it offers to bootstrap from primary range
owner not applicable when bootstrapping to replace dead nodes?

Thank you
