Subject: Re: distribution of token ranges with virtual nodes
From: Manu Zhang
To: user@cassandra.apache.org
Date: Thu, 10 Jan 2013 01:18:27 +0800

Is the cassandra-shuffle command in trunk, or is it only included in the
Debian package? I can't find it in trunk.

On Sat, Nov 3, 2012 at 2:18 AM, Eric Evans <eevans@acunu.com> wrote:

> On Fri, Nov 2, 2012 at 12:38 AM, Manu Zhang <owenzhang1990@gmail.com> wrote:
> >> It splits into a contiguous range, because truly upgrading to vnode
> >> functionality is another step.
> >
> > That confuses me. As I understand it, there is no point in having 256
> > tokens on the same node if I don't commit the shuffle.
>
> This isn't exactly true. By-partition operations (think repair,
> streaming, etc.) will be more reliable in the sense that if they fail
> and need to be restarted, there is less that is lost/needs redoing.
> Also, if all you did was migrate from 1-token-per-node to 256
> contiguous tokens per node, normal topology changes (bootstrapping new
> nodes, decommissioning old ones) would gradually work to redistribute
> the partitions. And, from a topology perspective, splitting the one
> partition into many contiguous partitions is a no-op; it's safe to do
> and there is no cost to speak of from a computational or IO
> perspective.
>
> On the other hand, shuffling requires moving tokens around the
> cluster. If you completely randomize placement, it follows that you
> will need to relocate all of the cluster's data, so it's quite costly.
> It's also precedent setting, and not thoroughly tested yet.
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu
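
To make the quoted point concrete, here is a rough toy model, not Cassandra code. It assumes a made-up ring size (RING), four nodes, a replication factor of 1, and models "shuffle" as randomly reassigning the existing tokens among the nodes. It does not reproduce what the real cassandra-shuffle tool does. All function names are hypothetical. The sketch compares how much of the ring changes owner after a contiguous split versus after a randomized placement.

```python
import bisect
import random

RING = 2**64  # hypothetical ring size, for illustration only


def single_token_ring(num_nodes):
    """One evenly spaced token per node, as a sorted list of (token, node)."""
    step = RING // num_nodes
    return [((i + 1) * step - 1, i) for i in range(num_nodes)]


def split_contiguous(ring, vnodes):
    """Split each node's range into `vnodes` contiguous sub-ranges it keeps."""
    out = []
    prev = ring[-1][0] - RING  # predecessor of the first token, wrapped
    for token, node in ring:
        width = token - prev
        for j in range(1, vnodes + 1):
            # The last sub-token (j == vnodes) equals the original token.
            out.append(((prev + width * j // vnodes) % RING, node))
        prev = token
    return sorted(out)


def shuffle(ring, rng):
    """Toy shuffle: randomly reassign existing tokens, same count per node."""
    nodes = [node for _, node in ring]
    rng.shuffle(nodes)
    return [(token, node) for (token, _), node in zip(ring, nodes)]


def moved_fraction(before, after, rng, samples=20000):
    """Estimate the fraction of keys whose owner differs between two rings."""
    tokens_before = [t for t, _ in before]
    tokens_after = [t for t, _ in after]
    moved = 0
    for _ in range(samples):
        key = rng.randrange(RING)
        # A key belongs to the first token >= key, wrapping around the ring.
        i = bisect.bisect_left(tokens_before, key) % len(before)
        j = bisect.bisect_left(tokens_after, key) % len(after)
        moved += before[i][1] != after[j][1]
    return moved / samples


if __name__ == "__main__":
    rng = random.Random(0)
    one = single_token_ring(4)
    split = split_contiguous(one, 256)
    print("moved by contiguous split:", moved_fraction(one, split, rng))
    print("moved by toy shuffle:", moved_fraction(split, shuffle(split, rng), rng))
```

In this model the contiguous split leaves every key with its original owner, which is the sense in which it is a no-op. The randomized reassignment changes the owner for most of the ring, roughly 1 - 1/N of it with N nodes. Under these assumptions, that is why shuffling is the costly step.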