From: Vijay
Date: Wed, 15 Jun 2011 11:14:45 -0700
Subject: Re: Docs: Token Selection
To: user@cassandra.apache.org

Correction....

"The problem in the above approach is you have 2 nodes between 12 to 4 in DC1 but from 4 to 12 you just have 1"

should be

"The problem in the above approach is you have 1 node between 0-4 (25%) and one node covering the rest, which is 4-16, 0-0 (75%)"

Regards,
</VJ>



On Wed, Jun 15, 2011 at 11:10 AM, Vijay <vijay2win@gmail.com> wrote:
The problem in the above approach is you have 2 nodes between 12 to 4 in DC1 but from 4 to 12 you just have 1.... (Which will cause uneven distribution of data across the nodes)
It is easier to think of each DC as its own ring, split each ring equally, and interleave them together....

DC1 Node 1 : token 0
DC1 Node 2 : token 8..

DC2 Node 1 : token 4..
DC2 Node 2 : token 12..
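
The interleaving described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `interleaved_tokens` and `RING` are hypothetical names (not part of Cassandra or its tooling), the token space is taken to be RandomPartitioner's 0 to 2**127, and each DC's evenly spaced ring is simply rotated by a fraction of its own step.

```python
RING = 2**127  # assumed RandomPartitioner token space


def interleaved_tokens(nodes_per_dc, ring=RING):
    """Return evenly spaced tokens per DC, rotated so the DCs interleave.

    nodes_per_dc maps a DC name to its node count, e.g. {"DC1": 2, "DC2": 2}.
    """
    dcs = len(nodes_per_dc)
    tokens = {}
    for dc_index, (dc, n) in enumerate(nodes_per_dc.items()):
        step = ring // n                    # even split inside this DC
        offset = (dc_index * step) // dcs   # rotate this DC's ring
        tokens[dc] = [(i * step + offset) % ring for i in range(n)]
    return tokens


# Abbreviated 16-unit ring, as used in this thread
print(interleaved_tokens({"DC1": 2, "DC2": 2}, ring=16))
# Full-size ring
print(interleaved_tokens({"DC1": 2, "DC2": 2}))
```

On the 16-unit ring this yields tokens 0 and 8 for DC1 and 4 and 12 for DC2, so each node owns half of its own DC's ring while no two nodes share a token.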

Regards,
</VJ>




On Tue, Jun 14, 2011 at 7:31 PM, AJ <aj@dude.podzone.net> wrote:

Yes, which means that the ranges overlap each other.

Is this just a convention, or is it technically required when using NetworkTopologyStrategy? Would it be acceptable to split the ranges into quarters by ignoring the data centers, such as:

DC1
node 1 = 0     Range: (12, 16], (0, 0]
node 2 = 4     Range: (0, 4]

DC2
node 3 = 8     Range: (4, 8]
node 4 = 12    Range: (8, 12]

If this is OK, are there any drawbacks to this?
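
One way to check for drawbacks is to compute how much of the ring each node owns when every data center is treated as its own ring, which is how the replies in this thread describe NetworkTopologyStrategy. The sketch below is illustrative only: `ownership` and `RING` are hypothetical names, and the 16-unit ring is the abbreviated one used in this thread, not Cassandra's real token space.

```python
from fractions import Fraction

RING = 16  # abbreviated ring from this thread (assumption, not the real size)


def ownership(tokens, ring=RING):
    """Fraction of the ring owned by each token when only `tokens` form the ring.

    A node with token t owns (previous_token, t], wrapping around at `ring`.
    """
    ordered = sorted(tokens)
    owned = {}
    for i, tok in enumerate(ordered):
        prev = ordered[i - 1]                # i == 0 wraps to the last token
        width = (tok - prev) % ring or ring  # a lone node owns everything
        owned[tok] = Fraction(width, ring)
    return owned


dc1 = ownership([0, 4])    # DC1: node 1 = 0, node 2 = 4
dc2 = ownership([8, 12])   # DC2: node 3 = 8, node 4 = 12
print(dc1)
print(dc2)
```

For the quartered layout, each DC ends up with one node owning 3/4 of that DC's data and the other owning 1/4; evenly spaced tokens within a DC (for example 0 and 8) give 1/2 each.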



On 6/14/2011 6:10 PM, Vijay wrote:
Yes... That's right... if you are trying to say the below...

DC1

Node1 Owns 50%
(Ranges 8..4 -> 8..5 & 8..5 -> 0)
Node2 Owns 50%
(Ranges 0 -> 1 & 1 -> 8..4)

DC2

Node1 Owns 50%
(Ranges 8..5 -> 0 & 0 -> 1)

Node2 Owns 50%
(Ranges 1 -> 8..4 & 8..4 -> 8..5)

Regards,
</VJ>



On Tue, Jun 14, 2011 at 3:47 PM, AJ <aj@dude.podzone.net> wrote:
This http://wiki.apache.org/cassandra/Operations#Token_selection says:

"With NetworkTopologyStrategy, you should calculate the tokens the nodes in each DC independantly."

and gives the example:

DC1
node 1 = 0
node 2 = 85070591730234615865843651857942052864

DC2
node 3 = 1
node 4 = 85070591730234615865843651857942052865


So, according to the above, the token ranges would be (abbreviated nums):

DC1
node 1 = 0       Range: (8..4, 16], (0, 0]
node 2 = 8..4    Range: (0, 8..4]

DC2
node 3 = 1       Range: (8..5, 16], (0, 1]
node 4 = 8..5    Range: (1, 8..5]
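
The ranges listed above can be reproduced with a small sketch that treats each DC's tokens as a separate ring. The names `ranges` and `RING` are hypothetical, and the ring size of 2**127 is an assumption based on the tokens in the wiki example (85070591730234615865843651857942052864 is 2**126).

```python
RING = 2**127  # assumed RandomPartitioner token space


def ranges(tokens, ring=RING):
    """Map each token to the (start, end] ranges it owns within its own DC."""
    ordered = sorted(tokens)
    result = {}
    for i, tok in enumerate(ordered):
        prev = ordered[i - 1]  # i == 0 wraps to the last token
        if prev < tok:
            result[tok] = [(prev, tok)]
        else:
            # Wrap-around: split the range at the top of the ring
            result[tok] = [(prev, ring), (0, tok)]
    return result


half = 2**126  # 85070591730234615865843651857942052864
print(ranges([0, half]))      # DC1
print(ranges([1, half + 1]))  # DC2
```

In DC1 the node with token 0 gets the wrap-around pair (8..4, ring] and (0, 0], and in DC2 the node with token 1 gets (8..5, ring] and (0, 1], matching the abbreviated listing.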


If the above is correct, then I would be surprised, as this paragraph is the only place where one would discover this, and it may be easy to miss... unless there's a doc buried somewhere in plain view that I missed.

So, have I interpreted this paragraph correctly? Was this design to help keep data somewhat localized if that was important, such as a geographically dispersed DC?

Thanks!



