incubator-cassandra-dev mailing list archives

From: Sam Overton
Subject: Re: RFC: Cassandra Virtual Nodes
Date: Tue, 20 Mar 2012 14:51:56 GMT
On 20 March 2012 13:37, Eric Evans wrote:
> On Tue, Mar 20, 2012 at 6:40 AM, Sam Overton wrote:
>> On 20 March 2012 04:35, Vijay wrote:
>>> Maybe what I mean is a little simpler than that... We can consider
>>> every node as having multiple conservative ranges and move those ranges
>>> for bootstrap etc., instead of finding the midpoint etc. in the bootstrap
>>> code. Once we get that working all the way to the FS/Streaming, we can
>>> move those ranges and assign them to nodes in random order. Hope
>>> that makes sense.
>> I agree that this should be approached in incremental steps. Rick
>> already raised concerns about stability issues which might arise from
>> changing large parts of code at once.
>> I would anticipate the first step to be, exactly as you suggest, to
>> support multiple tokens per host instead of just one. Presumably in
>> your suggestion you imagine these tokens to define contiguous ranges
>> for a given host, so that the distribution model is the same as
>> before, but bootstrap can be done incrementally.
>> This would be a great first step. The extension to a virtual node
>> scheme as described previously is then fairly trivial. The only
>> additional change needed is to assign the tokens in some other way
>> which does not restrict the ranges to being contiguous.
> Sounds good to me.
> What can an upgrading user expect in the way of disruption?  What
> would be required to move an existing cluster from one token per node
> to virtual nodes?  Could this be made transparent?

For an end user, the disruption would be no greater than the rolling
upgrade process they already go through to move to a new version.

This is how I envisage it working:
* When a node is upgraded and the new version starts up in an old
cluster, it would split its own token range into multiple sub-ranges
by assigning itself more tokens in its own range
* These tokens could then be gossiped to any other new versions in the
cluster. The old versions don't need to know about these intermediate
tokens because distribution is exactly the same - node ranges are
still contiguous
* Once every node has been upgraded, distribution is still the same as
before, but now ranges are split into sub-ranges
* The benefits of vnodes start to become apparent when adding new
nodes to the cluster - a new node bootstrapping would take an equal
share of data from every other node and would not require doubling the
cluster to maintain balance
* As more nodes are added to the cluster it gets closer to full vnode
distribution as more of the original hosts' ranges get reassigned to
new nodes
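
As a rough illustration of the steps above, here is a minimal Python
sketch. It is not Cassandra code: the function names (split_range,
upgrade, owner, bootstrap), the toy ring size of 1200, and the policy of
handing a new node the first sub-range of each existing node are all
assumptions made for the example. It shows that splitting a node's own
range leaves key ownership unchanged, and that a new node can then take
one sub-range from every existing node.

```python
# Illustrative sketch only; every name here is hypothetical, not a Cassandra API.
from bisect import bisect_left
from collections import Counter


def split_range(prev_token, own_token, parts, ring_size):
    """Return `parts` tokens that cut the range (prev_token, own_token]
    into equal sub-ranges. The last token returned is own_token itself."""
    width = (own_token - prev_token) % ring_size or ring_size
    return [(prev_token + width * i // parts) % ring_size
            for i in range(1, parts + 1)]


def upgrade(ring, parts, ring_size):
    """ring maps node -> its single token. Each node splits its OWN range,
    so the result (token -> node) still gives every node one contiguous span."""
    ordered = sorted(ring.items(), key=lambda kv: kv[1])
    token_owner = {}
    for idx, (node, token) in enumerate(ordered):
        prev_token = ordered[idx - 1][1]  # idx 0 wraps to the last node
        for t in split_range(prev_token, token, parts, ring_size):
            token_owner[t] = node
    return token_owner


def owner(token_owner, key_token):
    """A key belongs to the first token >= key_token, wrapping around the ring."""
    tokens = sorted(token_owner)
    i = bisect_left(tokens, key_token)
    return token_owner[tokens[i % len(tokens)]]


def bootstrap(token_owner, new_node):
    """Assumed policy: the new node takes over one sub-range (the first one
    in token order) from each existing node."""
    result = dict(token_owner)
    donors = set()
    for t in sorted(token_owner):
        node = token_owner[t]
        if node not in donors:
            result[t] = new_node
            donors.add(node)
    return result


# Toy ring of 1200 tokens with three evenly spaced nodes (assumed values).
RING_SIZE = 1200
ring = {"A": 0, "B": 400, "C": 800}
before = {token: node for node, token in ring.items()}
after = upgrade(ring, parts=4, ring_size=RING_SIZE)
grown = bootstrap(after, "D")
share = Counter(owner(grown, k) for k in range(RING_SIZE))
print(share)
```

In this toy setup every key keeps its owner after the split, and once the
hypothetical node D has bootstrapped, each of the four nodes owns a
quarter of the ring without any existing token having moved.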

If the user wants to migrate to full vnode functionality straight away
then they can do a rolling migration (decommission/bootstrap). During
this migration there would be some imbalance in the cluster, but once
all of the old nodes have been migrated, the cluster would be

Sam Overton
Acunu | @acunu
