From: Jonathan Ellis
Date: Tue, 16 Aug 2011 13:54:35 -0500
Subject: Re: Partitioning, tokens, and sequential keys
To: user@cassandra.apache.org

What tokens did you end up using?

Are you sure it's actually due to different amounts of rows? Have you
run cleanup and compact to make sure it's not unused data / obsolete
replicas taking up the space?

On Tue, Aug 16, 2011 at 1:41 PM, David McNelis wrote:
> We are currently running a three node cluster where we assigned the initial
> tokens using the Python script that is in the Wiki, and we're currently
> using the Random Partitioner, RF=1, Cassandra 0.8 from the Riptano RPM.
> However, we're seeing one node take on over 60% of the data as we load
> data.
>
> Our keys are sequential, and can range from 0 to 2^64, though in practice
> we're between 1 and 2,000,000,000, with the current max around 50,000. In
> order to balance out the load, would we be best served changing our tokens
> to make the top and bottom 1/3rd of the node go to the previous and next
> nodes respectively, then running nodetool move?
>
> Even if we do that, it would seem that we'd likely continue to run into this
> sort of issue as we add additional data... would we be better served
> with a different Partitioner strategy? Or will we need to very actively
> manage our tokens to avoid getting into an unbalanced situation?
>
> --
> David McNelis
> Lead Software Engineer
> Agentis Energy
> www.agentisenergy.com
> o: 630.359.6395
> c: 219.384.5143
> A Smart Grid technology company focused on helping consumers of energy
> control an often under-managed resource.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
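
For anyone checking the row-count question offline, here is a rough sketch, not Cassandra code. It assumes the keys are stored as decimal ASCII strings (the thread does not say how they are serialized), and it approximates the Random Partitioner as the absolute value of the key's MD5 digest in a 0..2^127 token space. The function names are hypothetical. It computes evenly spaced tokens for three nodes and counts how many of the keys 1..50,000 would land on each node.

```python
import hashlib
from bisect import bisect_left
from collections import Counter

RING_SIZE = 2 ** 127  # assumed Random Partitioner token space: 0 .. 2**127


def balanced_tokens(node_count):
    """Evenly spaced initial tokens, one per node, in ascending order."""
    return [i * (RING_SIZE // node_count) for i in range(node_count)]


def key_token(key):
    """Approximate a Random Partitioner token for a key given as bytes.

    Reads the MD5 digest as a signed big-endian integer and takes its
    absolute value. This is an approximation for illustration only.
    """
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


def owner(tokens, token):
    """Index of the owning node: the first node whose token is >= `token`.

    Tokens above the highest node token wrap around to node 0.
    `tokens` must be sorted in ascending order.
    """
    return bisect_left(tokens, token) % len(tokens)


def distribution(tokens, keys):
    """Number of keys owned by each node, in node order."""
    counts = Counter(owner(tokens, key_token(k)) for k in keys)
    return [counts.get(i, 0) for i in range(len(tokens))]


if __name__ == "__main__":
    tokens = balanced_tokens(3)
    # Assumption: keys are serialized as decimal ASCII strings.
    keys = (str(n).encode("ascii") for n in range(1, 50001))
    counts = distribution(tokens, keys)
    total = sum(counts)
    for node, (token, count) in enumerate(zip(tokens, counts)):
        print(f"node {node}: token={token} share={count / total:.1%}")
```

If the shares come out near one third each under these assumptions, the sequential keys themselves are unlikely to explain a 60% skew, which points back at the actual tokens in use or at data that cleanup and compact would remove.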