Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Delivered-To: mailing list user@cassandra.apache.org
MIME-Version: 1.0
References: <45B81A29-6E63-471D-9C50-D0CC98ACA035@generalsentiment.com>
Date: Thu, 19 Sep 2013 14:18:57 -0700
Subject: Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.
From: Mohit Anchlia
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=089e0122f7e01c2ef104e6c31aca

--089e0122f7e01c2ef104e6c31aca
Content-Type: text/plain; charset=ISO-8859-1

Can you run nodetool repair on all the nodes first and look at the keys?

On Thu, Sep 19, 2013 at 1:22 PM, Suruchi Deodhar <
suruchi.deodhar@generalsentiment.com> wrote:

> Yes, the key distribution does vary across the nodes. For example, on the
> node with the highest data, Number of Keys (estimate) is 6527744 for a
> particular column family, whereas for the same column family on the node
> with least data, Number of Keys (estimate) = 3840.
>
> Is there a way to control this distribution by setting some parameter of
> cassandra.
>
> I am using the Murmur3 partitioner with NetworkTopologyStrategy.
>
> Thanks,
> Suruchi
>
>
>
> On Thu, Sep 19, 2013 at 3:59 PM, Mohit Anchlia wrote:
>
>> Can you check cfstats to see number of keys per node?
>>
>>
>> On Thu, Sep 19, 2013 at 12:36 PM, Suruchi Deodhar <
>> suruchi.deodhar@generalsentiment.com> wrote:
>>
>>> Thanks for your replies. I wiped out my data from the cluster and also
>>> cleared the commitlog before restarting it with num_tokens=256. I then
>>> uploaded data using sstableloader.
>>>
>>> However, I am still not able to see a uniform distribution of data
>>> across nodes of the clusters.
>>>
>>> The output of the bin/nodetool -h localhost status commands looks like
>>> follows. Some nodes have data as low as 1.12MB while some have as high as
>>> 912.57 MB.
>>>
>>> Datacenter: us-east
>>> ===================
>>> Status=Up/Down
>>> |/ State=Normal/Leaving/Joining/Moving
>>> --  Address         Load       Tokens  Owns (effective)  Host ID                               Rack
>>> UN  10.238.133.174  856.66 MB  256     8.4%              e41d8863-ce37-4d5c-a428-bfacea432a35  1a
>>> UN  10.238.133.97   439.02 MB  256     7.7%              1bf42b5e-4aed-4b06-bdb3-65a78823b547  1a
>>> UN  10.151.86.146   1.05 GB    256     8.0%              8952645d-4a27-4670-afb2-65061c205734  1a
>>> UN  10.138.10.9     912.57 MB  256     8.6%              25ccea82-49d2-43d9-830c-b9c9cee026ec  1a
>>> UN  10.87.87.240    70.85 MB   256     8.6%              ea066827-83bc-458c-83e8-bd15b7fc783c  1b
>>> UN  10.93.5.157     60.56 MB   256     7.6%              4ab9111c-39b4-4d15-9401-359d9d853c16  1b
>>> UN  10.92.231.170   866.73 MB  256     9.3%              a18ce761-88a0-4407-bbd1-c867c4fecd1f  1b
>>> UN  10.238.137.250  533.77 MB  256     7.8%              84301648-afff-4f06-aa0b-4be421e0d08f  1a
>>> UN  10.93.91.139    478.45 KB  256     8.1%              682dd848-7c7f-4ddb-a960-119cf6491aa1  1b
>>> UN  10.138.2.20     1.12 MB    256     7.9%              a6d4672a-0915-4c64-ba47-9f190abbf951  1a
>>> UN  10.93.31.44     282.65 MB  256     7.8%              67a6c0a6-e89f-4f3e-b996-cdded1b94faf  1b
>>> UN  10.236.138.169  223.66 MB  256     9.1%              cbbf27b0-b53a-4530-bfdf-3764730b89d8  1a
>>> UN  10.137.7.90     11.36 MB   256     7.4%              17b79aa7-64fc-4e16-b96a-955b0aae9bb4  1a
>>> UN  10.93.77.166    837.64 MB  256     8.8%              9a821d1e-40e5-445d-b6b7-3cdd58bdb8cb  1b
>>> UN  10.120.249.140  838.59 MB  256     9.4%              e1fb69b0-8e66-4deb-9e72-f901d7a14e8a  1b
>>> UN  10.90.246.128   216.75 MB  256     8.4%              054911ec-969d-43d9-aea1-db445706e4d2  1b
>>> UN  10.123.95.248   147.1 MB   256     7.2%              a17deca1-9644-4520-9e62-ac66fc6fef60  1b
>>> UN  10.136.11.40    4.24 MB    256     8.5%              66be1173-b822-40b5-b650-cb38ae3c7a51  1a
>>> UN  10.87.90.42     11.56 MB   256     8.0%              dac0c6ea-56c6-44da-a4ec-6388f39ecba1  1b
>>> UN  10.87.75.147    549 MB     256     8.3%              ac060edf-dc48-44cf-a1b5-83c7a465f3c8  1b
>>> UN  10.151.49.88    119.86 MB  256     8.9%              57043573-ab1b-4e3c-8044-58376f7ce08f  1a
>>> UN  10.87.83.107    484.3 MB   256     8.3%              0019439b-9f8a-4965-91b8-7108bbb55593  1b
>>> UN  10.137.20.183   137.67 MB  256     8.4%              15951592-8ab2-473d-920a-da6e9d99507d  1a
>>> UN  10.238.170.159  49.17 MB   256     9.4%              32ce322e-4f7c-46c7-a8ce-bd73cdd54684  1a
>>>
>>> Is there something else that I should be doing differently?
>>>
>>> Thanks for your help!
>>>
>>> Suruchi
>>>
>>>
>>>
>>> On Thu, Sep 19, 2013 at 3:20 PM, Richard Low wrote:
>>>
>>>> The only thing you need to guarantee is that Cassandra doesn't start
>>>> with num_tokens=1 (the default in 1.2.x) or, if it does, that you wipe all
>>>> the data before starting it with higher num_tokens.
>>>>
>>>>
>>>> On 19 September 2013 19:07, Robert Coli wrote:
>>>>
>>>>> On Thu, Sep 19, 2013 at 10:59 AM, Suruchi Deodhar <
>>>>> suruchi.deodhar@generalsentiment.com> wrote:
>>>>>
>>>>>> Do you suggest I should try with some other installation mechanism?
>>>>>> Are there any known problems with the tar installation of cassandra 1.2.9
>>>>>> that I should be aware of?
>>>>>>
>>>>>
>>>>> I was asking in the context of this JIRA :
>>>>>
>>>>> https://issues.apache.org/jira/browse/CASSANDRA-2356
>>>>>
>>>>> Which does not seem to apply in your case!
>>>>>
>>>>> =Rob
>>>>>
>>>>
>>>>
>>>
>>
>

--089e0122f7e01c2ef104e6c31aca
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Can you run nodetool repair on all the nodes first and look at the keys?
--089e0122f7e01c2ef104e6c31aca--
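
One way to quantify the imbalance in the nodetool status output quoted in this thread is to compare the spread of the Load column with the spread of the Owns (effective) column. The following is a minimal sketch, assuming the row layout shown in the quoted listing. The regular expression and the names `parse_status` and `spread` are illustrative and not part of nodetool, the sample rows are four rows copied from the listing, and the units are assumed to be binary multiples.

```python
import re

# Multipliers for the units seen in the Load column (assumed binary multiples).
UNITS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}

# Matches rows such as:
# UN  10.238.133.174  856.66 MB  256  8.4%  e41d8863-...  1a
ROW = re.compile(
    r"^(?P<state>[UD][NLJM])\s+(?P<addr>\S+)\s+(?P<load>[\d.]+)\s+(?P<unit>[KMGT]B)"
    r"\s+(?P<tokens>\d+)\s+(?P<owns>[\d.]+)%"
)


def parse_status(text):
    """Return (address, load_bytes, owns_percent) for each node row found."""
    nodes = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            load = float(m["load"]) * UNITS[m["unit"]]
            nodes.append((m["addr"], load, float(m["owns"])))
    return nodes


def spread(values):
    """Ratio of the largest value to the smallest value."""
    return max(values) / min(values)


# Four rows copied from the listing quoted in this thread.
SAMPLE = """\
UN  10.151.86.146   1.05 GB    256  8.0%  8952645d-4a27-4670-afb2-65061c205734  1a
UN  10.138.10.9     912.57 MB  256  8.6%  25ccea82-49d2-43d9-830c-b9c9cee026ec  1a
UN  10.93.91.139    478.45 KB  256  8.1%  682dd848-7c7f-4ddb-a960-119cf6491aa1  1b
UN  10.138.2.20     1.12 MB    256  7.9%  a6d4672a-0915-4c64-ba47-9f190abbf951  1a
"""

nodes = parse_status(SAMPLE)
load_spread = spread([load for _, load, _ in nodes])
owns_spread = spread([owns for _, _, owns in nodes])
print(f"load spread: {load_spread:.0f}x, ownership spread: {owns_spread:.2f}x")
```

On these rows the ownership percentages differ by well under a factor of two, while the on-disk load differs by several orders of magnitude. That matches the report that token ownership looks even but the data itself is not.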