incubator-cassandra-user mailing list archives

From James Golick <jamesgol...@gmail.com>
Subject Re: Uneven distribution using RP
Date Tue, 22 Jun 2010 17:07:45 GMT
This node's load is now growing at a ridiculous rate. It is at 105GB, with
the next most loaded node at 70.63GB.

Given that RF=3, I would assume the nodes holding its replicas would be
growing relatively quickly too?
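For reference, a balanced ring under RandomPartitioner places tokens at evenly spaced points in the 0..2**127 token space. A minimal sketch of that arithmetic (the `ideal_tokens` helper name is mine, not a Cassandra API; the node count 4 matches the ring output quoted later in this thread):

```python
# Sketch: ideal RandomPartitioner tokens for an N-node ring.
# RP tokens live in the range [0, 2**127]; a balanced ring uses
# tokens at i * 2**127 // N for i = 1..N.

def ideal_tokens(n):
    """Evenly spaced tokens for an n-node ring under RandomPartitioner."""
    return [(i + 1) * 2**127 // n for i in range(n)]

for t in ideal_tokens(4):
    print(t)
```

Comparing these against the tokens in the quoted `nodetool ring` output, the ring here is already close to evenly spaced, which suggests the load skew is more likely down to data (e.g. row sizes) than token placement.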

On Mon, Jun 21, 2010 at 6:44 AM, aaron morton <aaron@thelastpickle.com> wrote:

> According to http://wiki.apache.org/cassandra/Operations nodetool repair
> is used to perform a major compaction and compare data between the nodes,
> repairing any conflicts. Not sure that would improve the load balance,
> though it may reduce some wasted space on the nodes.
>
> nodetool loadbalance will remove the node from the ring after streaming
> its data to the remaining nodes, then add it back in at the busiest part.
> I've used it before and it seems to do the trick.
>
> Also consider the size of the rows. Are they generally similar or do you
> have some that are much bigger? The keys will be distributed without
> considering the size of the data.
>
> The RP is random though, I do not think it tries to evenly distribute the
> keys. So some variance with a small number of nodes should be expected, IMHO.
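That point about random placement can be illustrated with a quick simulation: hash a batch of keys onto four perfectly even token ranges and count how many land in each. This is a sketch, not Cassandra's actual code (real RP uses MD5 into a 2**127 space; plain MD5 into 2**128 is close enough for illustration, and the key names are made up):

```python
# Sketch illustrating RandomPartitioner variance: keys are hashed
# (MD5) onto the ring, so per-node key counts vary by chance even
# when the node tokens themselves are perfectly evenly spaced.
import hashlib
from collections import Counter

def token(key):
    # Map a key to an integer token via MD5 (range 0 .. 2**128 - 1).
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

# Four evenly spaced token boundaries, as in a balanced 4-node ring.
bounds = [(i + 1) * 2**128 // 4 for i in range(4)]

counts = Counter()
for k in range(100_000):
    t = token("key-%d" % k)
    node = next(i for i, b in enumerate(bounds) if t <= b)
    counts[node] += 1

# Per-node key counts; close to 25,000 each but rarely exactly equal.
print(counts)
```

Note the variance in *key counts* is small with this many keys; the larger skew seen in practice usually comes from unevenly sized rows, since key placement ignores data size.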
>
> Aaron
>
> On 21 Jun 2010, at 02:31, James Golick wrote:
>
> I ran cleanup on all of them and the distribution looked roughly even after
> that, but a couple of days later, it's looking pretty uneven.
>
> On Sun, Jun 20, 2010 at 10:21 AM, Jordan Pittier - Rezel <jordan@rezel.net
> > wrote:
>
>> Hi,
>> Have you tried nodetool repair (or cleanup) on your nodes?
>>
>>
>> On Sun, Jun 20, 2010 at 4:16 PM, James Golick <jamesgolick@gmail.com> wrote:
>>
>>> I just increased my cluster from 2 to 4 nodes, and RF=2 to RF=3, using
>>> RP.
>>>
>>> The tokens seem pretty even on the ring, but two of the nodes are far
>>> more heavily loaded than the others. I understand that there are a variety
>>> of possible reasons for this, but I'm wondering whether anybody has
>>> suggestions for how to tweak the tokens so that this problem is
>>> alleviated. Would it be better to just add 2 more nodes?
>>>
>>> Address       Status  Load      Range                                     Ring
>>>                                 170141183460469231731687303715884105728
>>> 10.36.99.140  Up      61.73 GB  43733172796241720623128947447312912170    |<--|
>>> 10.36.99.134  Up      69.7 GB   85070591730234615865843651857942052864    |   |
>>> 10.36.99.138  Up      54.08 GB  128813844387867495544257452469445200073   |   |
>>> 10.36.99.136  Up      54.75 GB  170141183460469231731687303715884105728   |-->|
>>>
>>>
>>
>>
>
>
