cassandra-user mailing list archives

From aaron morton <>
Subject Re: what is best way to do load balance if cluster has Imbalance load
Date Sun, 12 Sep 2010 03:03:29 GMT
That script is for use with the random partitioner. When using the order preserving partitioner,
the tokens have to be selected manually to best distribute the load.

See the section on Token Selection here
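
For reference, a minimal sketch of the kind of calculation such a script does for the random
partitioner, assuming the token space runs from 0 to 2**127 and that the nodes should be evenly
spaced. The function name balanced_tokens is made up for this example and is not taken from the
wiki script.

```python
# Sketch only: evenly spaced tokens for RandomPartitioner.
# Assumes a token space of 0 .. 2**127; balanced_tokens is a hypothetical name.
def balanced_tokens(node_count):
    """Return one token per node, spaced evenly around the ring."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * (2 ** 127) // node_count for i in range(node_count)]


if __name__ == "__main__":
    # Print a token for each node of a 4-node cluster.
    for index, token in enumerate(balanced_tokens(4)):
        print(index, token)
```

Each value would then be handed to nodetool move, one node at a time. This does not apply to the
order preserving partitioner, where tokens are keys rather than numbers.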


On 12 Sep 2010, at 04:05, March, Andres wrote:

> It should be simpler than that if the wiki is correct.  Search the list archives or wiki
> for the small script that calculates what the token should be for each node.  After that it
> should be as simple as using nodetool move on one node at a time.
> On Sep 11, 2010, at 2:00 AM, aaron morton wrote:
>> See the section on Moving or Removing nodes here
>> You should also read the bootstrapping section, as that is essentially what you are doing.
>> AFAIK you should use nodetool move to assign a new token to the nodes. As you say,
>> loadbalance is not recommended.
>> The token for the node represents the end of its token range, i.e. it is responsible
>> for the data from the previous node's token up to its own. My guess of the best approach
>> would be...
>> I think most of your keys are in the range managed by the 15 node, as the 128 node
>> has the same load and a much smaller key range. Use get_range_slices to have a look at the
>> keys in your db, or use your knowledge of the keys you are generating. You need to understand
>> whether you are making lots of keys that start with "aaaa". It may help if you provide some
>> more info on the keys.
>> If it is true that all the keys fall between wpt0w4Aomuhb8MQh and jnGTn7PwLTh6dxmC,
>> I would move the 155 node to a token that sits about two thirds of the way through the keys
>> between those tokens. Then move the 239 node to a token that sits about one third of the way
>> through the keys.
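
A rough sketch of picking those one third and two thirds tokens from a sample of keys, assuming
the sample is already sorted in the partitioner's order and is representative of the real data.
The function name split_tokens and the sample keys are invented for this example; the real keys
would come from something like get_range_slices.

```python
# Sketch only: choose split tokens for the order preserving partitioner
# from a sorted sample of row keys. split_tokens is a hypothetical helper.
def split_tokens(sorted_keys, parts):
    """Return parts - 1 keys that cut sorted_keys into roughly equal chunks.

    A token marks the end of a node's range, so each returned key is the
    last key of its chunk.
    """
    count = len(sorted_keys)
    if parts < 2 or count < parts:
        raise ValueError("need at least 2 parts and at least one key per part")
    return [sorted_keys[(count * i) // parts - 1] for i in range(1, parts)]


if __name__ == "__main__":
    # Made-up sample keys, already in sorted order.
    sample = ["a", "b", "c", "d", "e", "f", "g", "h", "i"]
    print(split_tokens(sample, 3))
```

With three parts, the first returned key would be the one third token and the second the two
thirds token.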
>> I'd let each node's move complete first; watch the streams to see when it is done. Then,
>> when it has finished and everything is working, run nodetool cleanup on each node.
>> I've not actually done this before, I just wanted to think about the problem :) So
>> I'd also wait for one of the adults around here to weigh in.
>> Aaron
>> On 11 Sep 2010, at 19:53, maneela a wrote:
>>> We have a Cassandra setup running 4 nodes with ReplicationFactor 2 and
>>> OrderPreservingPartitioner as the partitioner, but we have not provided InitialToken values.
>>> Could someone suggest the best way to balance my cluster? Some user threads have
>>> suggested "do not ever run nodetool loadbalance". If that option does suit my scenario,
>>> which node should I run the loadbalance command on first, before moving on to the
>>> second node?
>>> root@ip-10-251-190-239:/etc/cassandra# nodetool -h localhost ring
>>> Address       Status     Load          Range
>>>                                        wpt0w4Aomuhb8MQh
>>>               Up         119.4 GB      jnGTn7PwLTh6dxmC
>>>               Up         119.82 GB     kopMmFKwbk1yZFNX
>>>                          2.56 KB       v8w434UBnDIJyrIe
>>>                          2.56 KB       wpt0w4Aomuhb8MQh
>>> root@ip-10-251-190-239:/etc/cassandra# 
