From cassandra-user-return-734-apmail-incubator-cassandra-user-archive=incubator.apache.org@incubator.apache.org Thu Oct 01 16:27:36 2009
Mailing-List: contact cassandra-user-help@incubator.apache.org; run by ezmlm
Precedence: bulk
Reply-To: cassandra-user@incubator.apache.org
MIME-Version: 1.0
From: Igor Katkov
Date: Thu, 1 Oct 2009 12:26:45 -0400
Message-ID: <23b1e84e0910010926w65e08b7dke62d6c615e441645@mail.gmail.com>
Subject: distributing tokens equally along the key distribution space
To: cassandra-user@incubator.apache.org
Content-Type: multipart/alternative; boundary=0015173fe52a3609be0474e21ead

--0015173fe52a3609be0474e21ead
Content-Type: text/plain; charset=ISO-8859-1

Hi,

Question#1:
How do I manually select tokens to force equal spacing of tokens around the hash space?
If RandomPartitioner is used, a token is a BigInteger, so there is no [0, max value] interval to select token values from.
If everything is left to defaults, a token is a random number (hash of a GUID), so the generated tokens (10 of them, say) will not be evenly spaced on the ring.
Suppose I have X nodes; what would the correct token values be?

Question#2:
Let's assume that #1 was resolved somehow and key distribution is more or less even.
A new node "C" joins the cluster. Its token falls somewhere between two other tokens on the ring (from nodes "A" and "B", ordered clockwise). From now on "C" is responsible for a portion of the data that used to belong exclusively to "B".
a. Cassandra v0.4 will not automatically transfer this data to "C", will it?
b. Do all reads of these keys fail?
c. What happens to the data referenced by these keys on "B"? It will never be accessed there, therefore it becomes garbage. Since there is no GC, will it stick around forever?
d. What happens to the replicas of these keys?
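
To make the two scenarios above concrete, here is a minimal sketch in Python. It is not Cassandra's own code. It assumes that RandomPartitioner tokens fall in [0, 2**127) and that a node owns the range from the previous node's token (exclusive) up to its own token (inclusive). The names evenly_spaced_tokens and owner are hypothetical helpers made up for this illustration.

```python
from bisect import bisect_left

# Assumption: RandomPartitioner tokens lie in [0, 2**127).
RING_SIZE = 2**127


def evenly_spaced_tokens(node_count, ring_size=RING_SIZE):
    """Return node_count tokens spaced ring_size / node_count apart."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    return [i * ring_size // node_count for i in range(node_count)]


def owner(key_token, ring):
    """Return the name of the node that owns key_token.

    ring maps node name -> token. Assumed ownership rule: a node owns
    (previous token, own token], wrapping around past the largest token.
    """
    ordered = sorted((token, name) for name, token in ring.items())
    tokens = [token for token, _ in ordered]
    # First node whose token is >= key_token; wrap to the first node if none.
    index = bisect_left(tokens, key_token) % len(ordered)
    return ordered[index][1]


# Question#1: four evenly spaced tokens on a toy ring of size 100.
print(evenly_spaced_tokens(4, ring_size=100))  # [0, 25, 50, 75]

# Question#2: "C" joins between "A" and "B" and takes over part of B's range.
before = {"A": 0, "B": 50}
after = {"A": 0, "B": 50, "C": 25}
print(owner(20, before), owner(20, after))  # B C
print(owner(40, before), owner(40, after))  # B B
```

For a real cluster of X nodes, evenly_spaced_tokens(X) would give candidate values under the stated range assumption. The sketch only shows how ownership of a key shifts from "B" to "C"; it says nothing about whether or when the data itself is moved.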
--0015173fe52a3609be0474e21ead--