One problem is if the heaviest node is next to a node that's is
lighter than average, instead of heavier. Then if the new node takes
extra from the heaviest, say 75% instead of just 1/2, and then we take
1/2 of the heaviest's neighbor and put it on the heaviest, you made
that lighterthanaverage node even lighter.
Could you move 1/2, 1/4, etc. only until you get to a node lighter
than average? Probably. But I'm not sure if it's a big enough win to
justify the the complexity.
Probably a better solution would be a tool where you tell it "I want
to add N nodes to my cluster, analyzes the load factors and tell me
what tokens to add them with, and what additional moves to make to get
me within M% of equal loads, with the minimum amount of data
movement."
Jonathan
On Thu, Mar 25, 2010 at 1:52 PM, Jeremy Dunck <jdunck@gmail.com> wrote:
> On Thu, Mar 25, 2010 at 1:26 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>> Pretty much everything assumes that there is a 1:1 correspondence
>> between IP and Token. It's probably in the ballpark of "one month to
>> code, two to get the bugs out." Gossip is one of the trickier parts
>> of our code base, and this would be all over that. The actual storage
>> system changes would be simpler I think.
>
> What if adding a node shifted downring tokens less and less? If
> adding node N+1, it shifts the first N/2^x, the second N/2^2x, the
> third N/2^3x, etc, so that a fixed number of nodes are shifted, but
> the bump is smoothed out? Tokens stay 1:1.
>
> I'm talking out of my league here  haven't actually run a cluster
> yet  so probably a dumb idea. :)
>
