incubator-cassandra-user mailing list archives

From "Matthias L. Jugel" <...@thinkberg.com>
Subject Re: token distribution question
Date Sat, 18 Sep 2010 21:47:12 GMT
Okay, I am answering it myself. As confirmed by benblack on the cassandra irc channel and some
reading, the script he presented is for the RP. My idea now is to write a little script that
counts the keys in my main CF and divides the key range among the nodes accordingly.
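
A rough sketch of that idea, assuming the keys of the CF have already been
exported into a sorted list and that the OPP orders them like plain string
comparison. The function name balanced_tokens and the sample keys are made up
for illustration; this is not an existing Cassandra tool.

```python
def balanced_tokens(sorted_keys, num_nodes):
    """Pick one token per node so each range holds about the same number of keys.

    sorted_keys: all row keys, sorted in partitioner order (assumed here to be
    plain string order). Each returned token is the last key of its chunk.
    """
    if num_nodes < 1:
        raise ValueError("need at least one node")
    if len(sorted_keys) < num_nodes:
        raise ValueError("fewer keys than nodes")

    n = len(sorted_keys)
    tokens = []
    for i in range(1, num_nodes + 1):
        # Index of the last key that falls into the i-th chunk.
        last = (i * n) // num_nodes - 1
        tokens.append(sorted_keys[last])
    return tokens


if __name__ == "__main__":
    # Hypothetical keys in TIMESTAMP-SEQUENTIALID form.
    keys = sorted([
        "2009-03-01-0001", "2009-11-15-0002",
        "2010-01-10-0003", "2010-02-20-0004",
        "2010-05-05-0005", "2010-06-18-0006",
        "2010-08-30-0007", "2010-09-12-0008",
    ])
    for node, token in enumerate(balanced_tokens(keys, 4), start=1):
        print("node %d: token %s" % (node, token))
```

The result only reflects the keys present when the list was taken, so it would
have to be recomputed as the data grows.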

This gives a load balance based on a snapshot of the current data, so I will have to adapt
it a bit to take future changes into account. We will have a write hotspot, but that was
to be expected. Our writes depend on a few reads anyway, so being able to distribute those
reads is more important.

A good tip found here was that the token assigned is the last token the node is responsible
for. I thought I should mention it again.
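
To make that rule concrete, here is a small sketch of how a key maps to a node
when each token is the last key of its node's range. It assumes string-ordered
keys and a sorted token list; owner_index and the sample tokens are
hypothetical names and values, not Cassandra API.

```python
import bisect


def owner_index(tokens, key):
    """Return the index of the node responsible for key.

    tokens: the nodes' tokens, sorted ascending. A node owns every key that is
    greater than the previous node's token and up to and including its own.
    """
    # First position whose token is >= key, so a key equal to a token
    # belongs to that token's node.
    i = bisect.bisect_left(tokens, key)
    # A key beyond the largest token wraps around to the first node.
    return i % len(tokens)


if __name__ == "__main__":
    tokens = ["2009-12-31", "2010-04-30", "2010-08-31", "2010-12-31"]
    for key in ["2008-06-01", "2010-04-30", "2010-05-01", "2011-01-01"]:
        print("%s -> node %d" % (key, owner_index(tokens, key) + 1))
```

Under these assumptions, any key larger than the highest token lands on the
node with the lowest token, which is where new, ever-increasing keys would go.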

Please correct me if I am wrong at some point.

Leo.

On 18.09.2010, at 15:53, Matthias L. Jugel wrote:

> Hi,
> 
> In our setup we have four nodes and we are using the OPP. After starting and writing
> to the cluster, it becomes unbalanced, as one would expect. I would like to do some manual
> reassignment, as we have some information on what the distribution looks like.
> 
> Ben Black's little script seems to calculate an even distribution assuming one
> uses the RP, am I right?
> 
> To be a bit more precise, these are the keys I need to balance:
> 
> - a UUID that looks like "TIMESTAMP-SEQUENTIALID"
> 
> The UUIDs are unevenly distributed over the years and months, as we have only a few sequential
> ids in early 2009, more later, and very many more starting with 2010 and so on. Before I start
> playing with load balancing, I would like to confirm that it makes sense to use this heuristic
> and assign something like this:
> 
> node 1: UUID(0,0) -> UUID(2009-12-31, 0)
> node 2: UUID(2010-01-01, 0) -> UUID(2010-04-30)
> node 3: UUID(2010-05-01, 0) -> UUID(2010-08-31)
> node 4: UUID(2010-09-01, 0) -> UUID(MAX, MAX)
> 
> and adding new nodes with a new range after the last node.
> 
> Another question: what influence does the wraparound in the ring have on the partitioning?
> 
> Thanks for your help.
> 
> Leo.

