From: Daniel Kluesing
To: "user@cassandra.apache.org"
Date: Thu, 25 Mar 2010 16:59:44 -0700
Subject: RE: Ring management and load balance
Message-ID: <33FDEB0CE2F65F41A4CF8769247BB3668DC57B27BF@EXVMBX016-3.exch016.msoutlookonline.net>

I agree it's only a problem with 'small' clusters - but it seems like 'small' is 'most users'? Even with 10 nodes it looks like a pretty big imbalance if I add an 11th node, and don't add the other 9 or move a large part of the ring. Or in practice have folks not had trouble with incremental scalability?

-----Original Message-----
From: Jonathan Ellis [mailto:jbellis@gmail.com]
Sent: Thursday, March 25, 2010 2:27 PM
To: user@cassandra.apache.org
Subject: Re: Ring management and load balance

One problem is if the heaviest node is next to a node that is lighter
than average, instead of heavier. Then if the new node takes extra from
the heaviest, say 75% instead of just 1/2, and then we take 1/2 of the
heaviest's neighbor and put it on the heaviest, you made that
lighter-than-average node even lighter.

Could you move 1/2, 1/4, etc. only until you get to a node lighter
than average? Probably. But I'm not sure if it's a big enough win to
justify the complexity.

Probably a better solution would be a tool where you tell it "I want
to add N nodes to my cluster; analyze the load factors and tell me what
tokens to add them with, and what additional moves to make to get me
within M% of equal loads, with the minimum amount of data movement."

-Jonathan

On Thu, Mar 25, 2010 at 1:52 PM, Jeremy Dunck wrote:
> On Thu, Mar 25, 2010 at 1:26 PM, Jonathan Ellis wrote:
>> Pretty much everything assumes that there is a 1:1 correspondence
>> between IP and Token.  It's probably in the ballpark of "one month to
>> code, two to get the bugs out."  Gossip is one of the trickier parts
>> of our code base, and this would be all over that.  The actual storage
>> system changes would be simpler I think.
>
> What if adding a node shifted down-ring tokens less and less?  If
> adding node N+1, it shifts the first N/2^x, the second N/2^2x, the
> third N/2^3x, etc., so that a fixed number of nodes are shifted, but
> the bump is smoothed out?  Tokens stay 1:1.
>
> I'm talking out of my league here -- haven't actually run a cluster
> yet -- so probably a dumb idea.  :-)
>
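
To put rough numbers on the 10-to-11 node case, here is a minimal sketch, not anything from the Cassandra code base. It assumes the token space is normalized to [0, 1), that each node owns the range ending at its token, and that load is proportional to the range owned. The helper names (ownership, add_bisecting_node, imbalance) are made up for illustration.

```python
from fractions import Fraction


def ownership(tokens):
    """Map each token to the share of the ring it owns.

    Assumption: a node owns the range from the previous token (exclusive)
    up to its own token (inclusive), wrapping around at 1.
    """
    ring = sorted(tokens)
    if len(ring) == 1:
        return {ring[0]: Fraction(1)}
    return {t: (t - ring[i - 1]) % 1 for i, t in enumerate(ring)}


def add_bisecting_node(tokens):
    """Hypothetical helper: put a new token at the midpoint of the largest range."""
    own = ownership(tokens)
    heaviest = max(own, key=own.get)
    new_token = (heaviest - own[heaviest] / 2) % 1
    return sorted(list(tokens) + [new_token])


def imbalance(tokens):
    """Return (lightest, heaviest) share, each relative to the ideal 1/len(tokens)."""
    shares = ownership(tokens).values()
    ideal = Fraction(1, len(tokens))
    return min(shares) / ideal, max(shares) / ideal


# Ten evenly spaced nodes, then an 11th that splits one range in half.
ten = [Fraction(i, 10) for i in range(10)]
eleven = add_bisecting_node(ten)
lightest, heaviest = imbalance(eleven)
print(f"lightest node: {float(lightest):.2f}x ideal, heaviest: {float(heaviest):.2f}x ideal")

# Existing tokens that would have to move to reach a perfectly even 11-node ring
# (assuming the even ring is anchored at token 0).
even_eleven = {Fraction(i, 11) for i in range(11)}
moves = sum(1 for t in ten if t not in even_eleven)
print(f"existing nodes to move for perfect balance: {moves} of {len(ten)}")
```

Under these assumptions the new node and the node it split each end up with about 0.55x the ideal share while the other nine carry 1.10x, and reaching a perfectly even ring would mean moving 9 of the 10 existing tokens. A tool like the one Jonathan describes would sit between those two extremes, trading a tolerance of M% for fewer moves.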