cassandra-user mailing list archives

From Wei Zhu <>
Subject Re: Recovering from a faulty cassandra node
Date Tue, 19 Mar 2013 17:06:28 GMT
Hi Dean,
If you are not using vnodes and want to replace the node, set the new token to the old token - 1,
not + 1. The reason is that tokens are assigned clockwise around the ring: each node owns the
range from just after the previous node's token up to and including its own token. If you set
the new token to old token - 1, the new node takes over all of the old node's data except for
the one token that was assigned to the old node itself. If you set the new token to old
token + 1, the new node will stream data for only one token. So as a good practice, don't
use 0 as a node token; start at 100. It is easier to go down from 100 than to go down
from 0 (where you need to calculate 2^127 - 1).
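
The range arithmetic above can be checked with a small sketch. It assumes a non-vnode ring where each node owns (previous token, own token] and the RandomPartitioner range of 0 to 2^127 - 1; the ring tokens and the helper names owner and owned_count are made up for illustration and are not part of Cassandra.

```python
from bisect import bisect_left

RING_SIZE = 2 ** 127  # assumed RandomPartitioner range: tokens 0 .. 2**127 - 1


def owner(tokens, key_token):
    """Return the node token that owns key_token.

    The owner is the first node token >= key_token, wrapping to the
    smallest token when key_token is past the largest one.
    """
    ring = sorted(tokens)
    return ring[bisect_left(ring, key_token) % len(ring)]


def owned_count(tokens, node_token):
    """Count the tokens in (predecessor, node_token] on the ring."""
    ring = sorted(tokens)
    prev = ring[ring.index(node_token) - 1]  # index -1 wraps to the last node
    return (node_token - prev) % RING_SIZE or RING_SIZE


# Hypothetical ring; the node being replaced sits at token 1000.
old_ring = [100, 1000, 2000, 3000]

minus_one = old_ring + [1000 - 1]  # replacement node at old token - 1
plus_one = old_ring + [1000 + 1]   # replacement node at old token + 1

print(owned_count(minus_one, 999))   # 899: nearly all of the old node's range
print(owned_count(minus_one, 1000))  # 1: the old node keeps only its own token
print(owned_count(plus_one, 1001))   # 1: the new node gets a single token

# Going "down" from token 0 wraps around to the top of the ring.
print((0 - 1) % RING_SIZE == 2 ** 127 - 1)  # True
```

With the old node at 1000 and its predecessor at 100, the old range holds 900 tokens; a replacement at 999 picks up 899 of them, while a replacement at 1001 picks up one.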

Hope I didn't confuse you.


----- Original Message -----
From: "Dean Hiller" <>
Sent: Tuesday, March 19, 2013 8:25:25 AM
Subject: Re: Recovering from a faulty cassandra node

I have not done this yet, but from everything I have read, your best option is to follow
the replace-node documentation, which I believe requires you to

 1.  Use the same token BUT add 1 to it so the cluster doesn't think it's the same computer
 2.  Have the bootstrap option set (or something similar) so streaming takes effect.

I would, however, test all of that in QA to make sure it works. If you use QUORUM reads/writes,
a good part of that test would be to take node X down after node Y is back in the cluster
to make sure reads/writes are working on the node you fixed… you just need to make sure
node X shares one of the token ranges of node Y AND your writes/reads are in that token range.
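
As a rough illustration of why the key range matters for that test, here is a sketch that models replica placement on a non-vnode ring. It assumes a replication factor of 3 and SimpleStrategy-style placement (the owner plus the next nodes clockwise); the ring layout, node names, and helper functions are hypothetical and not taken from the thread.

```python
from bisect import bisect_left


def replicas(ring, key_token, rf):
    """Return the nodes holding key_token: the owner plus the next
    rf - 1 nodes clockwise (assumed SimpleStrategy-style placement).

    ring maps node token -> node name.
    """
    tokens = sorted(ring)
    start = bisect_left(tokens, key_token) % len(tokens)
    count = min(rf, len(tokens))
    return [ring[tokens[(start + k) % len(tokens)]] for k in range(count)]


def quorum(rf):
    """Replicas that must respond for a QUORUM read or write."""
    return rf // 2 + 1


def quorum_ok(ring, key_token, rf, down):
    """True if enough replicas of key_token are up to satisfy QUORUM."""
    live = [node for node in replicas(ring, key_token, rf) if node not in down]
    return len(live) >= quorum(rf)


# Hypothetical 4-node ring: Y is the repaired node, X is the node taken down.
ring = {100: "A", 1000: "Y", 2000: "X", 3000: "D"}
RF = 3  # assumed replication factor

print(replicas(ring, 500, RF))               # ['Y', 'X', 'D']: X and Y share this range
print(quorum_ok(ring, 500, RF, {"X"}))       # True only because Y is serving again
print(quorum_ok(ring, 500, RF, {"X", "Y"}))  # False: one live replica is below quorum
print(replicas(ring, 2500, RF))              # ['D', 'A', 'Y']: X is not a replica here
```

In this model, a key with token 500 exercises the repaired node when X is down, while a key with token 2500 would pass the test without telling you much, since X holds no replica of it.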


From: Jabbar Azam <<>>
Date: Tuesday, March 19, 2013 8:51 AM
Subject: Recovering from a faulty cassandra node


I am using Cassandra 1.2.2 on a 4-node test cluster with vnodes. I spent over a week
inserting lots of data into the cluster. Near the end of the process, one of the nodes had
a hardware fault.

I have fixed the hardware fault, but the file system on that node is corrupt, so I'll have
to reinstall the OS and Cassandra.

I can think of two ways of reintegrating the host into the cluster

1) shrink the cluster to three nodes and add the node into the cluster

2) Add the node into the cluster without shrinking

I'm not sure of the best approach to take and I'm not sure how to achieve each step.

Can anybody help?


 Jabbar Azam
