incubator-cassandra-user mailing list archives

From sankalp kohli <kohlisank...@gmail.com>
Subject Re: Safely adding new nodes without losing data
Date Sat, 20 Jul 2013 21:21:21 GMT
Interesting... I guess you have to add one node at a time and run repair on
it.


On Sat, Jul 20, 2013 at 7:30 AM, E S <tr1sklion@yahoo.com> wrote:

> I am trying to understand the best procedure for adding new nodes.  The
> one that I see most often online seems to have a hole where there is a low
> probability of permanently losing data.  I want to understand what I am
> missing in my understanding.
>
> Let's say I have a 3 node cluster (node A,B,C) with a RF of 3.  I want to
> double the cluster size to 6 (node A,B,C,D,E,F) while keeping the
> replication factor of 3.  Let's assume we use vnodes.
>
> My understanding is that you bootstrap the 3 new nodes, then run repair,
> then cleanup.  Here is my failure case:
>
> Before bootstrapping I have a row that is only replicated onto nodes A and
> B.  Assume I did a quorum write and there was some hiccup on C, hinted
> handoff didn't work, and a repair has not yet been run.  Let's also assume
> that once nodes D, E, and F have been bootstrapped, this row's new replicas
> are D, E, and F.
>
> My reading through the bootstrapping code shows that for a given range, it
> streams it only from one node (unlike repair).  There is a 1/9 chance that
> D,E,F will have streamed the range containing the row from C, which does
> not have this row.
>
> Now not even a read at consistency level ALL will return the row.  A
> repair will not solve it, and when cleanup is run, the row is permanently
> deleted.
>
> I don't think this problem would normally happen without vnodes, because
> when doubling you would alternate the new nodes with the old nodes in the
> ring, so while quorum might not work until the final repair, "all" would,
> and a repair would solve the problem.  With vnodes though, some of the
> ranges will follow the pattern above (range ownership moving from A,B,C to
> D,E,F).
>
> Am I missing something here?  If I'm right, I think the only way to avoid
> this is adding fewer than a quorum of new nodes (in this case 1) before
> doing a repair.  That would be painful since repairs take a while.
>
> Thanks for any help.
>
> Eddie
>
>
>
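
For anyone who wants to check the odds in the failure case quoted above, here is a rough sketch that enumerates every way D, E, and F could each pick a single streaming source for the range. It assumes (this is my assumption, not something taken from the bootstrap code) that each new node picks one of A, B, C independently and with equal probability. Under that assumption the chance that all three stream from C comes out to 1/27 rather than 1/9, so the 1/9 figure may rest on a different model of how sources are chosen. The names below are made up for illustration.

```python
from fractions import Fraction
from itertools import product

OLD_REPLICAS = ("A", "B", "C")  # replicas of the range before bootstrap
NEW_REPLICAS = ("D", "E", "F")  # replicas of the range after bootstrap
HAS_ROW = {"A", "B"}            # C missed the write in this scenario


def probability_row_lost():
    """Chance that no new replica streams from a node holding the row.

    Assumption: each new node independently picks one old replica,
    uniformly at random, as its only streaming source for the range.
    """
    outcomes = list(product(OLD_REPLICAS, repeat=len(NEW_REPLICAS)))
    lost = [o for o in outcomes if not any(src in HAS_ROW for src in o)]
    return Fraction(len(lost), len(outcomes))


if __name__ == "__main__":
    print(probability_row_lost())
```

Either way the probability is non-zero for every range whose ownership moves entirely from A,B,C to D,E,F, which is the point of the original question.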
