incubator-cassandra-user mailing list archives

From "Hiller, Dean" <>
Subject Re: Recovering from a faulty cassandra node
Date Wed, 20 Mar 2013 12:25:09 GMT
Makes sense... thanks!!! I will note that for our future replacements (we
still have to test a full replacement out).


On 3/19/13 11:06 AM, "Wei Zhu" <> wrote:

>Hi Dean,
>If you are not using vnodes and want to replace the node, set the new token
>to the old token - 1, not + 1. The reason is that tokens are assigned
>clockwise along the ring. If you set the new token to old token - 1,
>the new node will take over all the data of the old node except for the one
>token that stays assigned to the old node. If you set the new token to
>old token + 1, the new node will only stream the data of one token. So,
>as good practice, don't use 0 as a node token; start with 100. It's
>easier to go down from 100 than from 0 (where you need to calculate
>2^127 - 1).
>Hope I didn't confuse you.
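
A minimal sketch of the token arithmetic described above, not taken from the thread. It assumes RandomPartitioner (integer tokens in [0, 2^127)) and that a node owns the range from its predecessor's token (exclusive) up to its own token (inclusive). The ring tokens 100, 200, 300 and both function names are made up for illustration.

```python
# Sketch only. Assumes RandomPartitioner, i.e. integer tokens in [0, 2**127).
RING_SIZE = 2 ** 127


def replacement_token(old_token, ring_size=RING_SIZE):
    """Old token - 1, wrapping around the ring when the old token is 0."""
    return (old_token - 1) % ring_size


def owned_tokens(token, ring_tokens, ring_size=RING_SIZE):
    """Count the tokens in (predecessor, token], the range that `token` owns."""
    others = sorted(t for t in ring_tokens if t != token)
    if not others:
        return ring_size  # a lone node owns the whole ring
    below = [t for t in others if t < token]
    predecessor = below[-1] if below else others[-1]  # wrap past 0
    return (token - predecessor) % ring_size


# Hypothetical ring; the faulty node holds token 200.
ring = [100, 200, 300]
old = 200

minus_one = replacement_token(old)                   # 199
print(owned_tokens(minus_one, ring + [minus_one]))   # 99: nearly all of the old range
print(owned_tokens(old, ring + [minus_one]))         # 1: old node keeps one token

plus_one = old + 1                                   # 201
print(owned_tokens(plus_one, ring + [plus_one]))     # 1: new node gets one token
print(replacement_token(0) == 2 ** 127 - 1)          # True: why 0 is an awkward token
```

With the old token minus 1, the new node's range covers everything the old node owned except the old token itself; with plus 1, it covers a single token.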
>----- Original Message -----
>From: "Dean Hiller" <>
>Sent: Tuesday, March 19, 2013 8:25:25 AM
>Subject: Re: Recovering from a faulty cassandra node
>I have not done this yet, but from all that I have read your best
>option is to follow the replace-node documentation, which I believe says you
>need to
> 1.  Have the token be the same BUT add 1 to it so it doesn't think it's
>the same computer
> 2.  Have the bootstrap option set or something so streaming takes effect.
>I would, however, test all of that out in QA to make sure it works. If you
>use QUORUM reads/writes, a good part of that test would be to take node X
>down after your node Y is back in the cluster to make sure reads/writes
>are working on the node you fixed... you just need to make sure node X
>shares one of the token ranges of node Y AND your writes/reads are in
>that token range.
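
A rough sketch of why the test above is informative, not taken from the thread. It assumes a replication factor of 3 (the thread does not state one) and uses the standard quorum size of floor(RF / 2) + 1. The function names are hypothetical.

```python
# Sketch only: quorum arithmetic for a single token range.
def quorum(replication_factor):
    """Replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1


def can_meet_quorum(replication_factor, replicas_down):
    """True if enough replicas of the range are still up to reach QUORUM."""
    return replication_factor - replicas_down >= quorum(replication_factor)


RF = 3  # assumption: not stated in the thread

print(quorum(RF))              # 2
print(can_meet_quorum(RF, 1))  # True: node X down, repaired node Y serving
print(can_meet_quorum(RF, 2))  # False: node X down and node Y not really serving
```

Under these assumptions, QUORUM requests in a range shared by X and Y can only succeed with X down if Y is actually serving that range, which is what the test checks.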
>From: Jabbar Azam <<>>
>Reply-To: "<>"
>Date: Tuesday, March 19, 2013 8:51 AM
>To: "<>"
>Subject: Recovering from a faulty cassandra node
>I am using Cassandra 1.2.2 on a 4-node test cluster with vnodes. I waited
>for over a week to insert lots of data into the cluster. Near the end
>of the process one of the nodes had a hardware fault.
>I have fixed the hardware fault but the file system on that node is
>corrupt, so I'll have to reinstall the OS and Cassandra.
>I can think of two ways of reintegrating the host into the cluster:
>1) Shrink the cluster to three nodes and add the node into the cluster
>2) Add the node into the cluster without shrinking
>I'm not sure of the best approach to take and I'm not sure how to achieve
>each step.
>Can anybody help?
> Jabbar Azam
