cassandra-user mailing list archives

From "March, Andres" <>
Subject Re: what is best way to do load balance if cluster has Imbalance load
Date Sat, 11 Sep 2010 16:05:48 GMT
It should be simpler than that if the wiki is correct.  Search the list archives or wiki for
the small script that calculates what the token should be for each node.  After that it should
be as simple as using nodetool move on one node at a time.
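
For reference, a minimal sketch of that kind of token calculation. It assumes RandomPartitioner,
whose tokens are integers in the range 0 to 2**127; the function name is made up for illustration.
With OrderPreservingPartitioner, as in this thread, tokens are key strings, so evenly spaced
integers do not apply directly and the tokens have to come from the actual key distribution.

```python
def balanced_tokens(node_count):
    """Return evenly spaced integer tokens for a RandomPartitioner ring.

    Assumption: the token space is 0 .. 2**127, which holds for
    RandomPartitioner only, not for OrderPreservingPartitioner.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    ring_size = 2 ** 127
    return [i * ring_size // node_count for i in range(node_count)]


if __name__ == "__main__":
    # One token per node; each would be assigned with nodetool move.
    for index, token in enumerate(balanced_tokens(4)):
        print("node %d: %d" % (index, token))
```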

On Sep 11, 2010, at 2:00 AM, aaron morton wrote:

See the section on Moving or Removing nodes here; you should also read the bootstrapping
section, as that is essentially what you are doing.

AFAIK you should use nodetool move to assign a new token to the nodes. As you say, loadbalance
is not recommended.

The token for a node represents the end of its token range, i.e. it's responsible for the
data from the previous node's token up to its own. My guess of the best approach would be...

I think most of your keys are in the range managed by the 15 node, as the 128 node has the
same load and a much smaller key range. Use get_range_slices to have a look at the keys in
your db, or use your knowledge of the keys you are generating. You need to understand whether
you are making lots of keys that start with "aaaa". It may help if you provide some more info
on the keys.

If it is true that all the keys fall between wpt0w4Aomuhb8MQh and jnGTn7PwLTh6dxmC, I would
move the 155 node to a token that is about two thirds of the way through the keys between
those tokens. Then move the 239 node to a token that is about one third of the way through.
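
A rough sketch of how the one-third and two-thirds tokens could be picked from a sample of keys
(for example, keys collected with get_range_slices). The function names are hypothetical, and it
assumes plain string comparison in Python orders the keys the same way the partitioner does,
which should hold for ASCII keys but is not verified here. Because the range from
wpt0w4Aomuhb8MQh to jnGTn7PwLTh6dxmC wraps around the ring, the keys are ordered starting just
after the range's start token.

```python
def ring_order(keys, start_token):
    """Sort keys in ring order, starting just after start_token and wrapping."""
    after = sorted(k for k in keys if k > start_token)
    before = sorted(k for k in keys if k <= start_token)
    return after + before


def split_tokens(keys, start_token, parts):
    """Pick parts - 1 tokens that cut the sampled keys into equal shares.

    A token marks the END of a node's range, so each returned token is
    the last key of its share.
    """
    ordered = ring_order(keys, start_token)
    if len(ordered) < parts:
        raise ValueError("need at least as many keys as parts")
    count = len(ordered)
    return [ordered[(i * count) // parts - 1] for i in range(1, parts)]


if __name__ == "__main__":
    # Made-up sample keys, for illustration only.
    sample = ["x1", "x2", "a1", "a2", "b1", "b2"]
    print(split_tokens(sample, "w", 3))
```

The quality of the split depends on how representative the key sample is, so a larger sample
gives more trustworthy tokens.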

I'd let each node's move complete before starting the next; watch the streams to see when it's
done. Then, when all moves have finished and everything is working, run nodetool cleanup on
each node.
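
The order of operations above could be written down as a small plan generator. This is only a
sketch: it builds the command strings and does not run them, the host names and tokens are
placeholders, and the "wait" step is left as a comment because checking that streaming has
finished is up to the operator.

```python
def rebalance_plan(moves, all_hosts):
    """Build an ordered list of steps for moving nodes one at a time.

    moves: list of (host, new_token) pairs, in the order to apply them.
    all_hosts: every node in the cluster, for the final cleanup pass.
    """
    steps = []
    for host, token in moves:
        steps.append("nodetool -h %s move %s" % (host, token))
        # Do not start the next move until this one has finished streaming.
        steps.append("# wait until streaming for %s has finished" % host)
    for host in all_hosts:
        steps.append("nodetool -h %s cleanup" % host)
    return steps


if __name__ == "__main__":
    # Placeholder hosts and tokens, for illustration only.
    plan = rebalance_plan(
        [("node-a", "tokenA"), ("node-b", "tokenB")],
        ["node-a", "node-b", "node-c", "node-d"],
    )
    print("\n".join(plan))
```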

I've not actually done this before, I just wanted to think about the problem :) So I'd also
wait for one of the adults around here to weigh in.


On 11 Sep 2010, at 19:53, maneela a wrote:

We have a Cassandra setup running 4 nodes with ReplicationFactor: 2 and OrderPreservingPartitioner
as the partitioner, but we have not provided InitialToken values.

Could someone suggest the best way to balance my cluster? Some user threads have suggested
"do not ever run nodetool loadbalance". If that option does suit my scenario, on which node
should I run the loadbalance command first, before running it on a second node?

root@ip-10-251-190-239:/etc/cassandra# nodetool -h localhost ring
Address       Status     Load          Range                                      Ring
                                       wpt0w4Aomuhb8MQh
              Up         119.4 GB      jnGTn7PwLTh6dxmC                           |<--|
              Up         119.82 GB     kopMmFKwbk1yZFNX                           |   |
                         2.56 KB       v8w434UBnDIJyrIe                           |   |
                         2.56 KB       wpt0w4Aomuhb8MQh                           |-->|
