Date: Tue, 5 Feb 2013 11:43:50 -0500
Subject: Clarification on num_tokens setting
From: Baron Schwartz
To: user@cassandra.apache.org

As I understand the num_tokens setting, it makes Cassandra do the following pseudocode when a new node is added:

    for 1...num_tokens do
        my_token = rand(0, 2^128-1)
        next_token = min(tokens in cluster where token > my_token)
        my_range = (my_token, next_token - 1)
    done

Now the new node owns num_tokens chunks of keys that previously belonged to other nodes.

My point is, with 1 node in the cluster, the ring is divided into num_tokens ranges. With N nodes, the ring is divided into N*num_tokens. Correct? The docs do not make this clear for me.

And another point: the tokens are randomly chosen, so the ranges of keys are not uniform, although with enough nodes in the cluster there probably won't be any really large ranges. Correct?
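To make the two questions concrete, here is a rough Python simulation of the model described above. It is only a sketch of my understanding, not Cassandra's actual token allocation code: the names add_node, range_sizes and simulate are made up for illustration, and the token space 0..2^128-1 is taken from the pseudocode rather than from any particular partitioner. It counts the ranges on the ring and compares the largest and smallest range against the mean.

```python
import random

RING_SIZE = 2**128  # token space assumed from the pseudocode: 0 .. 2^128 - 1


def add_node(ring, node, num_tokens, rng):
    """Assign num_tokens random tokens to node; ring maps token -> owner."""
    for _ in range(num_tokens):
        token = rng.randrange(RING_SIZE)
        while token in ring:  # re-draw on the (very unlikely) collision
            token = rng.randrange(RING_SIZE)
        ring[token] = node


def range_sizes(ring):
    """Sizes of the ranges between consecutive tokens, wrapping around the ring."""
    tokens = sorted(ring)
    sizes = []
    for i, token in enumerate(tokens):
        nxt = tokens[(i + 1) % len(tokens)]
        # A single token on the ring spans the whole ring.
        sizes.append((nxt - token) % RING_SIZE or RING_SIZE)
    return sizes


def simulate(num_nodes, num_tokens, seed=0):
    """Build a ring by adding num_nodes nodes one at a time."""
    rng = random.Random(seed)
    ring = {}
    for n in range(num_nodes):
        add_node(ring, f"node{n}", num_tokens, rng)
    return ring


if __name__ == "__main__":
    for num_nodes in (1, 4, 16):
        sizes = range_sizes(simulate(num_nodes, num_tokens=256))
        mean = RING_SIZE / len(sizes)
        print(f"nodes={num_nodes} ranges={len(sizes)} "
              f"largest/mean={max(sizes) / mean:.2f} "
              f"smallest/mean={min(sizes) / mean:.4f}")
```

In this model the number of ranges is always the number of distinct tokens, so N nodes give N*num_tokens ranges, and the printed ratios show how far individual ranges stray from the mean size. The sketch deliberately does not assign ranges to owners, since which side of a token a node owns is part of what I am asking about.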