cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: what is best way to do load balance if cluster has Imbalance load
Date Sat, 11 Sep 2010 09:00:55 GMT
See the section on moving or removing nodes at http://wiki.apache.org/cassandra/Operations
You should also read the bootstrapping section, as that is essentially what you are doing.

AFAIK you should use nodetool move to assign a new token to the nodes. As you say, loadbalance
is not recommended. 

The token for a node marks the end of its token range, i.e. it's responsible for the
data from the previous node's token up to its own. My guess at the best approach would be...
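
To make the range rule concrete, here is a rough Python sketch of how a key maps to the
node that owns it. The tokens and addresses are copied from the ring output quoted further
down. The function name primary_owner is made up for illustration, replicas are ignored,
and it assumes plain Python string comparison orders these ASCII keys the same way
OrderPreservingPartitioner does.

```python
from bisect import bisect_left

# Tokens and node addresses copied from the quoted ring output, in token order.
RING = [
    ("jnGTn7PwLTh6dxmC", "10.202.87.15"),
    ("kopMmFKwbk1yZFNX", "10.223.71.128"),
    ("v8w434UBnDIJyrIe", "10.251.190.239"),
    ("wpt0w4Aomuhb8MQh", "10.201.217.155"),
]


def primary_owner(key, ring=RING):
    """Return the node with the first token >= key, wrapping past the last token.

    Illustrative helper only (hypothetical name); replica placement is ignored.
    """
    tokens = [token for token, _ in ring]
    index = bisect_left(tokens, key)
    # Keys greater than every token wrap around to the first node in the ring.
    return ring[index % len(ring)][1]


if __name__ == "__main__":
    for sample in ("aaaa", "k000", "zzzz"):
        print(sample, "->", primary_owner(sample))
```

Keys that sort after the last token and keys that sort before the first token both land on
the 15 node, which is why its range wraps around the ring.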

I think most of your keys are in the range managed by the 15 node, as the 128 node has the
same load and a much smaller key range. Use get_range_slices to have a look at the keys in
your db, or use your knowledge of the keys you are generating. You need to understand whether
you are making lots of keys that start with, say, "aaaa". It may help if you provide some
more info on the keys.

If it is true that all the keys fall between wpt0w4Aomuhb8MQh and jnGTn7PwLTh6dxmC, I would
move the 155 node to a token that is about two thirds of the way through the keys between
those tokens. Then move the 239 node to a token that is about one third of the way through.
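
One way to find those one-third and two-thirds points is to take a sample of your keys
(for example from get_range_slices) and pick the keys at those positions. Below is a
minimal Python sketch of that idea. The names ring_order and split_tokens are hypothetical,
the sample keys in the usage are invented, and it again assumes Python string comparison
matches the partitioner's ordering. Because "wpt0..." sorts after "jnGT...", the range
wraps around, so keys above the start token come first in ring order.

```python
def ring_order(keys, start, end):
    """Return the keys inside the range (start, end] sorted in ring order."""
    if start < end:
        return sorted(k for k in keys if start < k <= end)
    # Wrapping range: keys above `start` come first, then keys up to `end`.
    upper = sorted(k for k in keys if k > start)
    lower = sorted(k for k in keys if k <= end)
    return upper + lower


def split_tokens(keys, start, end, parts=3):
    """Pick parts-1 keys that split the sampled keys into roughly equal groups."""
    ordered = ring_order(keys, start, end)
    if len(ordered) < parts:
        raise ValueError("need at least as many sampled keys as parts")
    count = len(ordered)
    # The i-th token is the last key of the i-th group (tokens are inclusive).
    return [ordered[(count * i) // parts - 1] for i in range(1, parts)]


if __name__ == "__main__":
    # Invented sample keys, purely for illustration.
    sample = ["a", "b", "c", "x", "y", "z"]
    print(split_tokens(sample, "w", "j"))
```

With the invented sample above, the ring order is x, y, z, a, b, c, so the one-third token
is "y" and the two-thirds token is "a". A real sample needs to be large enough to reflect
how your keys are actually distributed.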

I'd let each node's move complete first; watch the streams to see when it's done. Then, when
the moves have finished and everything is working, run nodetool cleanup on each node.
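
The order of operations could be written down like this. This Python sketch only builds
the nodetool command lines and does not run anything; rebalance_plan is a made-up helper
and the token values in the usage are placeholders, not tokens I have worked out for your
cluster. Each move would still need to finish before the next command is run.

```python
def rebalance_plan(moves, all_hosts):
    """Build nodetool command lines: one move per node, then cleanup on every node.

    `moves` is a list of (host, new_token) pairs in the order they should run.
    Nothing is executed here; the caller runs each command by hand, one at a time.
    """
    plan = [["nodetool", "-h", host, "move", token] for host, token in moves]
    plan += [["nodetool", "-h", host, "cleanup"] for host in all_hosts]
    return plan


if __name__ == "__main__":
    hosts = ["10.202.87.15", "10.223.71.128", "10.251.190.239", "10.201.217.155"]
    # Placeholder tokens: substitute the values chosen from your own key sample.
    moves = [
        ("10.201.217.155", "TWO_THIRDS_TOKEN"),
        ("10.251.190.239", "ONE_THIRD_TOKEN"),
    ]
    for command in rebalance_plan(moves, hosts):
        print(" ".join(command))
```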

I've not actually done this before, I just wanted to think about the problem :) So I'd also
wait for one of the adults around here to weigh in. 

Aaron



On 11 Sep 2010, at 19:53, maneela a wrote:

> We have a Cassandra setup running 4 nodes with ReplicationFactor 2 and OrderPreservingPartitioner
> as the partitioner, but we have not provided InitialToken values.
> 
> Could someone suggest the best way to balance my cluster? Some user threads have
> suggested "do not ever run nodetool loadbalance". If that option does suit my scenario,
> which node should I run the loadbalance command on first, before running it on the
> 2nd node?
> 
> 
> root@ip-10-251-190-239:/etc/cassandra# nodetool -h localhost ring
> Address       Status     Load          Range                                      Ring
>                                        wpt0w4Aomuhb8MQh                           
> 10.202.87.15  Up         119.4 GB      jnGTn7PwLTh6dxmC                           |<--|
> 10.223.71.128 Up         119.82 GB     kopMmFKwbk1yZFNX                           |   |
> 10.251.190.239Up         2.56 KB       v8w434UBnDIJyrIe                           |   |
> 10.201.217.155Up         2.56 KB       wpt0w4Aomuhb8MQh                           |-->|
> root@ip-10-251-190-239:/etc/cassandra# 
> 
> 

