incubator-cassandra-dev mailing list archives

From Edward Capriolo <>
Subject Re: RFC: Cassandra Virtual Nodes
Date Mon, 19 Mar 2012 20:20:41 GMT
On Mon, Mar 19, 2012 at 4:15 PM, Sam Overton <> wrote:
> On 19 March 2012 09:23, Radim Kolar <> wrote:
>>> Hi Radim,
>>> The number of virtual nodes for each host would be configurable by the
>>> user, in much the same way that initial_token is configurable now. A host
>>> taking a larger number of virtual nodes (tokens) would have proportionately
>>> more of the data. This is how we anticipate support for heterogeneity in
>>> cluster hardware.
>> Yes, but this is good only for the random partitioner. For an ordered
>> partitioner you need to be able to split the token space on highly loaded
>> servers. With virtual tokens it will move load to a random node.
>> What if that random node is also a hotspot? Administration will be more
>> difficult because you don't know where the workload lands after you reduce
>> the number of tokens held by a node.
> For OPP we envisage an external management process performing active
> load balancing. The initial token assignment would be random within
> some user-specified range corresponding to the range of their keys.
> The load would then be monitored and hot-spots would be moved by
> reassigning virtual nodes to lightly loaded machines, or introducing
> new tokens into hot ranges. It makes sense that this would not be a
> manual process, but there would certainly be more control than just
> increasing or decreasing the number of tokens assigned to a node.
> --
> Sam Overton
> Acunu | | @acunu

For OPP the problem of load balancing is more profound. Now you need
vnodes per keyspace because you cannot expect each keyspace to have
the same distribution. With three keyspaces you are unsure as to
which one is causing the hotness. I think OPP should just go away.
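
To make the per-keyspace point concrete, here is a rough sketch of
order-preserving placement on a single shared ring. Everything in it is
made up for illustration: the four tokens, the node names, and the two
sample key sets are assumptions, and this is not Cassandra code. It only
shows that two keyspaces with different key distributions can heat up
different nodes on the same ring.

```python
from bisect import bisect_left
from collections import Counter

# Hypothetical shared ring: each token is the inclusive upper bound of a
# key range, owned by the named node. Tokens are plain strings because an
# order-preserving partitioner places keys by their sort order.
RING = [("g", "node1"), ("n", "node2"), ("t", "node3"), ("~", "node4")]
TOKENS = [token for token, _ in RING]


def owner(key):
    """Return the node owning `key`: the first token >= key, wrapping around."""
    index = bisect_left(TOKENS, key)
    return RING[index % len(RING)][1]


def load(keys):
    """Count how many keys of one keyspace land on each node."""
    return Counter(owner(key) for key in keys)


# Two made-up keyspaces with different key distributions.
keyspace_a = ["adam", "alice", "amy", "bob", "carol", "zed"]
keyspace_b = ["log-1", "tx-1", "tx-2", "tx-3", "user-9"]

load_a = load(keyspace_a)
load_b = load(keyspace_b)

print("keyspace_a:", dict(load_a))
print("keyspace_b:", dict(load_b))
```

In this toy setup keyspace_a piles onto node1 while keyspace_b piles
onto node4, so no single token layout suits both, and a per-node load
figure alone would not say which keyspace is responsible.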
