Date: Thu, 1 Oct 2009 13:04:14 -0500
Subject: Re: distributing tokens equally along the key distribution space
From: Jonathan Ellis
To: cassandra-user@incubator.apache.org

yes

On Thu, Oct 1, 2009 at 12:49 PM, Igor Katkov wrote:
> I see, so to keep the cluster always balanced (data-wise), the number of
> nodes should be doubled each time.
> I see some activity in JIRA regarding load balancing for v.0.5.
> Does it target the same thing: transferring data from node to node and
> modifying tokens appropriately?
>
> On Thu, Oct 1, 2009 at 1:42 PM, Jonathan Ellis wrote:
>>
>> You basically have two options. You can wipe your data, change the
>> tokens, and reload things, or you can add new nodes with -b to
>> rebalance things that way.
>>
>> On Thu, Oct 1, 2009 at 12:34 PM, Igor Katkov wrote:
>> > OK, so I don't need to use tokenupdater. What are the steps to rebalance
>> > data around the circle?
>> >
>> > In my test example (see below), I have A, D, B and C (clockwise) where
>> > A holds 1/3 of the data
>> > D - 1/6
>> > B - 1/6
>> > C - 1/3
>> > I'm willing to change tokens manually, it's all right.
>> > How do I tell all nodes to move data around in version 0.4? Do I change
>> > the token on node A and restart it with -b? Then the same for the rest,
>> > restarting only one node at a time?
>> >
>> >
>> >
>> > On Thu, Oct 1, 2009 at 1:22 PM, Jonathan Ellis wrote:
>> >>
>> >> tokenupdater does not move data around; it's just an alternative to
>> >> setting the token on each node. so you really want to get your
>> >> tokens right for your initial set of nodes before adding data.
>> >>
>> >> we're finishing up full load balancing for 0.5 but even then it's best
>> >> to start with a reasonable distribution instead of starting with
>> >> random and forcing the balancer to move things around a bunch.
>> >>
>> >> On Thu, Oct 1, 2009 at 12:14 PM, Igor Katkov wrote:
>> >> > What is the correct procedure for data re-partitioning?
>> >> > Suppose I have 3 nodes - "A", "B", "C"
>> >> > tokens on the ring:
>> >> > A: 0
>> >> > B: 2.8356863910078205288614550619314e+37
>> >> > C: 5.6713727820156410577229101238628e+37
>> >> >
>> >> > Then I add node "D", token: 1.4178431955039102644307275309655e+37
>> >> > (B/2)
>> >> > Start node "D" with -b
>> >> > Wait
>> >> > Run nodeprobe -host hostB ... cleanup on live "B"
>> >> > Wait
>> >> > Done
>> >> >
>> >> > Now data is not evenly balanced because tokens are not evenly spaced.
>> >> > I see that there is tokenupdater (org.apache.cassandra.tools.TokenUpdater).
>> >> > What happens with keys and data if I run it on "A", "B", "C" and "D"
>> >> > with new, better-spaced tokens? Should I? Is there a better procedure?
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Thu, Oct 1, 2009 at 12:48 PM, Jonathan Ellis wrote:
>> >> >>
>> >> >> On Thu, Oct 1, 2009 at 11:26 AM, Igor Katkov wrote:
>> >> >> > Hi,
>> >> >> >
>> >> >> > Question#1:
>> >> >> > How to manually select tokens to force equal spacing of tokens
>> >> >> > around the hash space?
>> >> >>
>> >> >> (Answered by Jun.)
>> >> >>
>> >> >> > Question#2:
>> >> >> > Let's assume that #1 was resolved somehow and key distribution is
>> >> >> > more or less even.
>> >> >> > A new node "C" joins the cluster. Its token falls somewhere
>> >> >> > between two other tokens on the ring (from nodes "A" and "B",
>> >> >> > clockwise-ordered). From now on "C" is responsible for a portion
>> >> >> > of data that used to belong exclusively to "B".
>> >> >> > a. Cassandra v.0.4 will not automatically transfer this data to
>> >> >> > "C", will it?
>> >> >>
>> >> >> It will, if you start C with the -b ("bootstrap") flag.
>> >> >>
>> >> >> > b. Do all reads to these keys fail?
>> >> >>
>> >> >> No.
>> >> >>
>> >> >> > c. What happens with the data referenced by these keys on "B"? It
>> >> >> > will never be accessed there, therefore it becomes garbage. Since
>> >> >> > there is no GC, will it stick around forever?
>> >> >>
>> >> >> nodeprobe cleanup after the bootstrap completes will instruct B to
>> >> >> throw out data that has been copied to C.
>> >> >>
>> >> >> > d. What happens to replicas of these keys?
>> >> >>
>> >> >> These are also handled by -b.
>> >> >>
>> >> >> -Jonathan
>> >> >
>> >> >
>> >
>> >
>
>
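
The token arithmetic discussed in this thread can be sketched in a few lines of Python. This is only an illustration, not code from Cassandra: the function names even_tokens and ownership are hypothetical, the ring size is left as a parameter (2**127 is the value usually quoted for the random partitioner, but treat that as an assumption), and the sketch assumes each node owns the range from the previous token (exclusive) up to its own token (inclusive), wrapping around the ring.

```python
from fractions import Fraction


def even_tokens(node_count, ring_size):
    """Return node_count tokens spaced evenly around a ring of ring_size."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    return [i * ring_size // node_count for i in range(node_count)]


def ownership(tokens, ring_size):
    """Map each node name to the fraction of the ring it owns.

    Assumption: a node owns (previous token, own token], wrapping at
    ring_size. Tokens must be distinct.
    """
    ordered = sorted(tokens.items(), key=lambda item: item[1])
    shares = {}
    for index, (name, token) in enumerate(ordered):
        previous = ordered[index - 1][1]  # index -1 wraps to the last node
        width = (token - previous) % ring_size
        # width is 0 only when there is a single node, which owns everything
        shares[name] = Fraction(width or ring_size, ring_size)
    return shares


if __name__ == "__main__":
    ring = 12  # toy ring size, chosen only to keep the numbers readable
    uneven = {"A": 0, "D": 2, "B": 4, "C": 8}
    for name, share in ownership(uneven, ring).items():
        print(name, share)
    print(even_tokens(4, ring))
```

On the toy ring of 12 units, the layout A=0, D=2, B=4, C=8 gives A and C one third each and D and B one sixth each, which is the same shape of imbalance described above. Spacing four tokens evenly (0, 3, 6, 9) gives each node one quarter. Picking tokens this way before loading data avoids the wipe-and-reload or bootstrap steps needed to rebalance later.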