cassandra-user mailing list archives

From Carl Mueller <>
Subject Re: vnode random token assignment and replicated data antipatterns
Date Tue, 20 Feb 2018 17:33:24 GMT
So in theory, one could double a cluster by:

1) Moving a snapshot of each existing node to a new node.
2) For each snapshot moved, working out the primary range of the new node by
taking the old node's primary range token and calculating the midpoint
value between that and the next primary range's start token.
3) The RF should be preserved, since each snapshot already holds the
replicated data for the old primary range, and the next primary range and
the n+1 primary range already have their replicas as well.

Data distribution will be the same as the old primary range distribution.
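
The midpoint calculation in step 2 could look roughly like the sketch below. It assumes Murmur3Partitioner's signed 64-bit token space; the names midpoint and doubling_tokens are made up for illustration and are not part of any Cassandra tool. It only does the arithmetic, including the wrap-around at the end of the ring, and says nothing about which snapshot belongs with which new token.

```python
# Sketch only: assumes Murmur3Partitioner, whose tokens are signed 64-bit ints.
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1
RING_SIZE = 2**64


def midpoint(left, right):
    """Token halfway along the clockwise arc from left to right."""
    span = (right - left) % RING_SIZE
    if span == 0:
        # A single token owns the whole ring.
        span = RING_SIZE
    mid = left + span // 2
    if mid > MAX_TOKEN:
        # Wrap past the top of the token space back to the bottom.
        mid -= RING_SIZE
    return mid


def doubling_tokens(old_tokens):
    """One new token per existing token: the midpoint to its clockwise neighbour."""
    ring = sorted(old_tokens)
    return [midpoint(t, ring[(i + 1) % len(ring)]) for i, t in enumerate(ring)]
```

With vnodes, old_tokens would be the full list of tokens across all nodes (for example 256 per node), so the doubled cluster needs one computed token for every existing one rather than one per node.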

Then nodetool cleanup and repair would get rid of old data ranges that are no longer needed.

In practice, is this possible? I have heard Priam can double clusters and
they do not use vnodes. I am assuming they take a similar approach but
only have to calculate a single token per node?

On Tue, Feb 20, 2018 at 11:21 AM, Carl Mueller <> wrote:

> As I understand it: Replicas of data are replicated to the next primary
> range owner.
> As tokens are randomly generated (at least in 2.1.x that I am on), can't
> we have this situation:
> Say we have RF3, but the tokens happen to line up where:
> NodeA handles 0-10
> NodeB handles 11-20
> NodeA handles 21-30
> NodeB handles 31-40
> NodeC handles 41-50
> The key aspect of that is that the random assignment of primary range
> vnode tokens has resulted in NodeA and NodeB being the primaries for four
> adjacent primary ranges.
> If replication works by going to the next adjacent nodes in the primary
> range, and we are, say RF3, then B will have a replica of A, and then the
> Is the RF distribution durable to this by ignoring the reappearance of A
> and then cycling through until a unique node (NodeC) is encountered, and
> then that becomes the third replica?
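
The ring walk described in the quoted question can be sketched as below. It assumes SimpleStrategy-style placement (walk the ring clockwise and skip any node that is already a replica) and ignores the rack and datacenter rules that NetworkTopologyStrategy adds. The function name replicas_for and the (token, node) ring layout are illustrative only, with each entry's token taken as the upper end of its range in the example.

```python
from bisect import bisect_left


def replicas_for(token, ring, rf):
    """Return up to rf distinct nodes for a token.

    ring is a list of (token, node) pairs sorted by token, where each
    token is treated as the inclusive upper end of that entry's range.
    """
    tokens = [t for t, _ in ring]
    # First ring entry whose token is >= the key's token, wrapping at the end.
    start = bisect_left(tokens, token) % len(ring)
    chosen = []
    for i in range(len(ring)):
        node = ring[(start + i) % len(ring)][1]
        if node not in chosen:
            # Repeated owners (A, B, A, B ...) are skipped, not counted twice.
            chosen.append(node)
        if len(chosen) == rf:
            break
    return chosen


# The layout from the quoted example.
ring = [(10, "NodeA"), (20, "NodeB"), (30, "NodeA"), (40, "NodeB"), (50, "NodeC")]
print(replicas_for(5, ring, 3))
```

Under these assumptions a key in the 0-10 range lands on NodeA and NodeB, and the walk passes over the second NodeA and NodeB entries before picking NodeC as the third replica.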
