Subject: Re: new "nodetool ring" output and unbalanced ring?
From: Tyler Hobbs <tyler@datastax.com>
To: user@cassandra.apache.org
Date: Mon, 10 Sep 2012 13:48:45 -0500

It leaves some breathing room for fixing mistakes, adding DCs, etc. The set of data in a 100 token range is basically the same as a 1 token range: nothing, statistically speaking.
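A quick back-of-the-envelope check of that claim (a hypothetical Python sketch, assuming RandomPartitioner's 2**127 token space, which matches the tokens later in this thread):

    # Fraction of the ring covered by a 100-token slice under
    # RandomPartitioner (token space of 2**127).
    RING = 2**127
    print(100 / RING)  # ~5.9e-37: statistically nothing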

On Mon, Sep 10, 2012 at 2:21 AM, Guy Incognito <dnd1066@gmail.com> wrote:
out of interest, why -100 and not -1 or +1? any particular reason?


On 06/09/2012 19:17, Tyler Hobbs wrote:
To minimize the impact on the cluster, I would bootstrap a new 1d node at (42535295865117307932921825928971026432 - 100), then decommission the 1c node at 42535295865117307932921825928971026432 and run cleanup on your us-east nodes.
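For concreteness, the token arithmetic behind that suggestion as a hypothetical Python sketch (the nodetool subcommands named in the comments, decommission and cleanup, are real; the placeholder hosts are not):

    # initial_token for the replacement 1d node: just below the
    # existing 1c node's token, per the suggestion above.
    old_token = 42535295865117307932921825928971026432
    new_token = old_token - 100
    print(new_token)  # 42535295865117307932921825928971026332
    # Once the new node has bootstrapped at new_token:
    #   nodetool -h <old 1c node> decommission
    #   nodetool -h <each us-east node> cleanup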

On Thu, Sep 6, 2012 at 1:11 PM, William Oberman <oberman@civicscience.com> wrote:
Didn't notice the racks! Of course....

If I change a 1c to a 1d, what would I have to do to make sure data shuffles around correctly? Repair everywhere?

will

On Thu, Sep 6, 2012 at 2:09 PM, Tyler Hobbs <tyler@datastax.com> wrote:
The main issue is that one of your us-east nodes is in rack 1d, while the rest are in rack 1c. With NTS and multiple racks, Cassandra will try to use one node from each rack as a replica for a range until it either meets the RF for the DC or runs out of racks, in which case it just picks nodes sequentially going clockwise around the ring (starting from the range being considered, not the last node that was chosen as a replica).

To fix this, you'll either need to make the 1d node a 1c node, or make 42535295865117307932921825928971026432 a 1d node so that you're alternating racks within that DC.
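To see the effect of that walk, here is a simplified, hypothetical Python sketch of the per-DC placement rule described above (not Cassandra's actual code); run against the ring in Will's nodetool output below, it reproduces the reported effective-ownership figures:

    # Simplified sketch of NTS per-DC replica selection (illustrative,
    # not Cassandra's real implementation): walk clockwise from a
    # range's primary position, take one node per unseen rack until the
    # DC's RF is met, then fill from the skipped nodes in ring order.
    RING = 2**127  # RandomPartitioner token space

    # (token, dc, rack), taken from the nodetool ring output below
    nodes = sorted([
        (0,                                        "us-east",   "1c"),
        (1,                                        "analytics", "1c"),
        (42535295865117307932921825928971026432,   "us-east",   "1c"),
        (85070591730234615865843651857942052864,   "us-east",   "1c"),
        (85070591730234615865843651857942052865,   "analytics", "1d"),
        (127605887595351923798765477786913079296,  "us-east",   "1d"),
    ])
    rf = {"analytics": 1, "us-east": 3}

    def replicas(dc, start):
        order = nodes[start:] + nodes[:start]  # clockwise from primary
        dc_ring = [n for n in order if n[1] == dc]
        chosen, skipped, racks = [], [], set()
        for n in dc_ring:
            if len(chosen) == rf[dc]:
                break
            if n[2] not in racks:
                chosen.append(n)
                racks.add(n[2])
            else:
                skipped.append(n)
        return chosen + skipped[:rf[dc] - len(chosen)]

    owned = dict.fromkeys((n[0] for n in nodes), 0)
    for i, (tok, _, _) in enumerate(nodes):
        size = (tok - nodes[i - 1][0]) % RING  # size of range (prev, tok]
        for dc in rf:
            for n in replicas(dc, i):
                owned[n[0]] += size

    for tok, dc, rack in nodes:
        print(f"{dc:9} {rack}  {100 * owned[tok] / RING:6.2f}%  {tok}")
    # prints 75/75/50% for the us-east 1c nodes, 50/50% for analytics,
    # and 100% for the lone us-east 1d node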


On Thu, Sep 6, 2012 at 12:54 PM, William Oberman <oberman@civicscience.com> wrote:
Hi,

I recently upgraded from 0.8.x to 1.1.x (through 1.0 briefly), and nodetool ring seems to have changed from "owns" to "effectively owns". "Effectively owns" seems to account for replication factor (RF). I'm ok with all of this, yet I still can't figure out what's up with my cluster. I have a NetworkTopologyStrategy with two data centers (DCs), with the following RF and node count per DC:
DC Name, RF, # in DC
analytics, 1, 2
us-east, 3, 4
So I'd expect 50% on each analytics node, and 75% for each us-east node. Instead, I have two nodes in us-east with 50/100??? (the other two are 75/75 as expected).
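For reference, the arithmetic behind that expectation (a trivial sketch): with evenly spaced tokens, a node's effective ownership is its DC's RF divided by the DC's node count.

    # Expected effective ownership = RF / nodes_in_dc, per DC
    for dc, rf, n in [("analytics", 1, 2), ("us-east", 3, 4)]:
        print(f"{dc}: {100 * rf / n:.0f}%")  # analytics: 50%, us-east: 75%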

Here is the output of nodetool (all nodes report the same thing):
Address  DC         Rack  Status  State   Load       Effective-Ownership  Token
                                                                          127605887595351923798765477786913079296
x.x.x.x  us-east    1c    Up      Normal  94.57 GB   75.00%               0
x.x.x.x  analytics  1c    Up      Normal  60.64 GB   50.00%               1
x.x.x.x  us-east    1c    Up      Normal  131.76 GB  75.00%               42535295865117307932921825928971026432
x.x.x.x  us-east    1c    Up      Normal  43.45 GB   50.00%               85070591730234615865843651857942052864
x.x.x.x  analytics  1d    Up      Normal  60.88 GB   50.00%               85070591730234615865843651857942052865
x.x.x.x  us-east    1d    Up      Normal  98.56 GB   100.00%              127605887595351923798765477786913079296

If I use cassandra-cli to do "show keyspaces;" I get (and again, all nodes report the same thing):
Keyspace: civicscience:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
    Options: [analytics:1, us-east:3]
I removed the output about all of my column families (CFs), hopefully that doesn't matter.

Did I compute the tokens wrong? Is there a combination of nodetool commands I can run to migrate the data around to rebalance to 75/75/75/75? I routinely run repair already. And as the release notes required, I ran upgradesstables during the upgrade process.
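As it turns out, the tokens in the ring output above do match the usual even-spacing formula, computed per DC with the second DC offset by 1 (a hypothetical sketch):

    # Evenly spaced RandomPartitioner tokens, computed per DC; the
    # second DC's tokens are offset by 1 to avoid exact collisions.
    RING = 2**127
    def tokens(n, offset=0):
        return [i * RING // n + offset for i in range(n)]
    print(tokens(4))     # us-east:   0, 42535..., 85070..., 127605...
    print(tokens(2, 1))  # analytics: 1, 85070...52865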

Before the upgrade, I was getting analytics = 0%, and us-east = 25% on each node, which I expected for "owns".

will




--
Tyler Hobbs
DataStax




--
Will Oberman
Civic Science, Inc.
3030 Penn Avenue., First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) oberman@civicscience.com



--
Tyler Hobbs
DataStax





--
Tyler Hobbs
DataStax