From user-return-63056-archive-asf-public=cust-asf.ponee.io@cassandra.apache.org  Thu Jan 24 10:32:55 2019
Return-Path: <user-return-63056-archive-asf-public=cust-asf.ponee.io@cassandra.apache.org>
X-Original-To: archive-asf-public@cust-asf.ponee.io
Delivered-To: archive-asf-public@cust-asf.ponee.io
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
	by mx-eu-01.ponee.io (Postfix) with SMTP id 2509F18062C
	for <archive-asf-public@cust-asf.ponee.io>; Thu, 24 Jan 2019 10:32:54 +0100 (CET)
Received: (qmail 91308 invoked by uid 500); 24 Jan 2019 09:32:53 -0000
Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
List-Help: <mailto:user-help@cassandra.apache.org>
List-Unsubscribe: <mailto:user-unsubscribe@cassandra.apache.org>
List-Post: <mailto:user@cassandra.apache.org>
List-Id: <user.cassandra.apache.org>
Reply-To: user@cassandra.apache.org
Delivered-To: mailing list user@cassandra.apache.org
Received: (qmail 91298 invoked by uid 99); 24 Jan 2019 09:32:53 -0000
Received: from pnap-us-west-generic-nat.apache.org (HELO spamd3-us-west.apache.org) (209.188.14.142)
    by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 24 Jan 2019 09:32:53 +0000
Received: from localhost (localhost [127.0.0.1])
	by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id D87CE1805D1
	for <user@cassandra.apache.org>; Thu, 24 Jan 2019 09:32:52 +0000 (UTC)
X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org
X-Spam-Flag: NO
X-Spam-Score: 2.798
X-Spam-Level: **
X-Spam-Status: No, score=2.798 tagged_above=-999 required=6.31
	tests=[DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1,
	DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_REPLY=1,
	HTML_MESSAGE=2, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001]
	autolearn=disabled
Authentication-Results: spamd3-us-west.apache.org (amavisd-new);
	dkim=pass (2048-bit key) header.d=gmail.com
Received: from mx1-lw-us.apache.org ([10.40.0.8])
	by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024)
	with ESMTP id QxJEjk6R79Y3 for <user@cassandra.apache.org>;
	Thu, 24 Jan 2019 09:32:51 +0000 (UTC)
Received: from mail-pl1-f173.google.com (mail-pl1-f173.google.com [209.85.214.173])
	by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTPS id 23C6961116
	for <user@cassandra.apache.org>; Thu, 24 Jan 2019 09:32:51 +0000 (UTC)
Received: by mail-pl1-f173.google.com with SMTP id a14so2606552plm.12
        for <user@cassandra.apache.org>; Thu, 24 Jan 2019 01:32:51 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20161025;
        h=mime-version:references:in-reply-to:from:date:message-id:subject:to;
        bh=QeKwicJlC31eOeLeNuqq73A7jLKHaDdIwD0IqU7td4I=;
        b=OtZFA65odYj+IYMACViMuyccY8GkVMZJOkT5Px6mP/aI/LslnpNpbFcb5YDBoL2CNc
         fJnNRvN78U4WvdVYo9SvUQNG85NQgtkDYKJetIOom90dyZo/x5ryGgvmiNgSCJV/84z0
         fkT03XnzsI16egiytHMIcnC8Bl3+EwG6JjyQxoEfBMR17OgMfOpavUZscub7BLO/TT1+
         SZq3symK7QIEIp9MGBBTEVBNfl53Yi7eG5MqeGvPhlY1NDp70+kogDzSIl9dSyqjLs1f
         +LlLox7JOjX+maI2duaiMTKt+ARv0C88DQ5lELcFEyOMUKykspztoeIPbwk6O61s3r9x
         aVFA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:mime-version:references:in-reply-to:from:date
         :message-id:subject:to;
        bh=QeKwicJlC31eOeLeNuqq73A7jLKHaDdIwD0IqU7td4I=;
        b=oAnEXaU5qYFXrP4gtN+PzL2DfKktPyJecbbR67WBNjieBCbH/UgOywZ2/tl06wn/LI
         zQZJ+xUFuEI4bTKrTch6yEL0+hU4MCP/+Ph6HM5rlMfQAOCN95+UEIXSRrRTV6F22t7R
         VJXObTvUfzNg2vcXWT1FX2Xej55xSeiFqsQw50iSB+allJfi6899z8CajVo7V8q7gIuq
         ibM9pjZxRkbzWHuZqJjX7FB64uVQtZLfW5UsDWa/2nlRW4txEvPT0thM0BrS/ueX9157
         nziT9QkyRsjUrr8oB95KLsUFkZM4+IM072DK3xlLfetuXJibezODIq0N8wiJdp/c3ge5
         NhQg==
X-Gm-Message-State: AJcUukdiY+B4cKzhvawHdxcHzwU/Fdt/ebxEwkMVRMwRfU3m6ILoSe7D
	hsWZ8GucRM7CH7uD5wiJEYbPMvo8TShUobZDlLtHhqwX
X-Google-Smtp-Source: ALg8bN6c9WeQmwBm7TnA84/n7hfFPu35tTxEbPTrLsnga/qaCW9NOYBlpI9XGZ1awNefnpXIC0oY+KPUC3abhgHGP7c=
X-Received: by 2002:a17:902:b68d:: with SMTP id c13mr5871188pls.102.1548322364009;
 Thu, 24 Jan 2019 01:32:44 -0800 (PST)
MIME-Version: 1.0
References: <1662e0e64bd.125be70e941930.7215874111596348455@zoho.com>
 <CA+VSrLrWFm6-JrQ1X+WDMwhrMqe+qQHZtPdtYcydDY13YqZzig@mail.gmail.com>
 <1662f227495.c503490a45862.6653275483034403400@zoho.com> <CACACo5TDW+G=3s0SBwWvaw9M86k31OnuCOdgr055pdicp9TokA@mail.gmail.com>
 <1662f3474ad.bf2a8d5046144.4729157283643333394@zoho.com> <CA+VSrLoZOoQGQr5jXzcpMaxuMsKXHMG2L1-C83o-1xdf86XGYw@mail.gmail.com>
 <166337908a8.104edcdb466026.8375725610568926021@zoho.com> <CAGJhWaXqCmOPUY0k0Ab1fRT_w-Z02uNAniAs5877+kAYfjJAnQ@mail.gmail.com>
In-Reply-To: <CAGJhWaXqCmOPUY0k0Ab1fRT_w-Z02uNAniAs5877+kAYfjJAnQ@mail.gmail.com>
From: Ahmed Eljami <ahmed.eljami@gmail.com>
Date: Thu, 24 Jan 2019 10:32:31 +0100
Message-ID: <CAJjuABap+k63=d9+3FtRXbFVvy5-uxkg1VCrkBfc7Poz_49o3g@mail.gmail.com>
Subject: Re: Re: Re: how to configure the Token Allocation Algorithm
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary="000000000000813112058030e462"

--000000000000813112058030e462
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hi folks,

What about adding a new keyspace, test_2, with the same RF to the existing
cluster?

Will it use the same logic as the existing keyspace test? Or should I
restart the nodes and add the new keyspace to cassandra.yaml?

Thanks.

On Tue, Oct 2, 2018 at 10:28, Varun Barala <varunbarala99@gmail.com> wrote:

> Hi,
>
> Managing `initial_token` yourself gives you more control over
> scale-in and scale-out.
> Let's say you have a three-node cluster with `num_tokens: 1`
>
> and your initial ring looks like this:
>
> Datacenter: datacenter1
> ==========
> Address    Rack   Status  State   Load       Owns    Token
>                                                      3074457345618258602
> 127.0.0.1  rack1  Up      Normal  98.96 KiB  66.67%  -9223372036854775808
> 127.0.0.2  rack1  Up      Normal  98.96 KiB  66.67%  -3074457345618258603
> 127.0.0.3  rack1  Up      Normal  98.96 KiB  66.67%  3074457345618258602
>
> Now let's say you want to scale out the cluster to twice the current
> throughput (meaning you are adding 3 more nodes).
>
> If you are using AWS EBS volumes, you can use the same volumes and
> spin up three more nodes by selecting the midpoints of the existing ranges,
> which means your new nodes already have data.
> Once you have mounted the volumes on your new nodes:
> * You need to delete every system table except the schema-related tables.
> * You need to generate the system/local table yourself, with `Bootstrap
>   state` set to completed and the same schema version as the existing nodes.
> * You need to remove the extra data on all the machines using cleanup
>   commands.
>
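A minimal Python sketch of the midpoint selection described above, assuming
the Murmur3Partitioner token range of -2**63 to 2**63 - 1. The helper names
`midpoint` and `doubling_tokens` are made up for illustration; nothing here
is a Cassandra API.

```python
RING_SIZE = 2**64        # number of tokens on the ring (assumes Murmur3Partitioner)
MAX_TOKEN = 2**63 - 1    # largest valid token


def midpoint(left, right):
    """Token halfway between two neighbouring tokens, walking clockwise from left."""
    # The modulo handles the range that wraps past MAX_TOKEN back to the minimum.
    width = (right - left) % RING_SIZE or RING_SIZE
    mid = left + width // 2
    return mid - RING_SIZE if mid > MAX_TOKEN else mid


def doubling_tokens(tokens):
    """One new token per existing range, so the node count doubles."""
    ring = sorted(tokens)
    return [midpoint(ring[i], ring[(i + 1) % len(ring)]) for i in range(len(ring))]


existing = [-9223372036854775808, -3074457345618258603, 3074457345618258602]
print(doubling_tokens(existing))
```

For the three-node ring quoted above this yields -6148914691236517206, -1 and
6148914691236517205 as the tokens for the three new nodes.
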
> This is how you can scale out a Cassandra cluster in minutes. If you
> want to add nodes one by one, you need to write a small tool that
> always finds the largest range in the existing cluster and splits it
> in half.
>
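The "small tool" mentioned above could look roughly like the following. This
is only a sketch, assuming Murmur3Partitioner tokens and a single token per
node; `next_token` is a hypothetical name, and the current tokens would have
to be collected separately (for example from `nodetool ring` output).

```python
RING_SIZE = 2**64        # assumes Murmur3Partitioner
MAX_TOKEN = 2**63 - 1


def next_token(tokens):
    """Return the token that splits the widest existing range in half."""
    ring = sorted(tokens)
    best_left, best_width = None, -1
    for i, left in enumerate(ring):
        right = ring[(i + 1) % len(ring)]
        # Clockwise distance to the next token; the last range wraps around.
        width = (right - left) % RING_SIZE or RING_SIZE
        if width > best_width:   # on ties the first widest range wins
            best_left, best_width = left, width
    mid = best_left + best_width // 2
    return mid - RING_SIZE if mid > MAX_TOKEN else mid


ring = [-9223372036854775808, -3074457345618258603, 3074457345618258602]
for _ in range(3):               # add three nodes one at a time
    token = next_token(ring)
    print(token)
    ring.append(token)
```

Adding nodes one by one this way ends up with the same three new tokens as
splitting every range at once, only in a different order.
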
> However, I have never tested this thoroughly, but it should work
> conceptually. Here we take advantage of the fact that we already have the
> volumes (data) for the new nodes, so we don't need to bootstrap them.
>
> Thanks & Regards,
> Varun Barala
>
> On Tue, Oct 2, 2018 at 2:31 PM onmstester onmstester <onmstester@zoho.com>
> wrote:
>
>>
>> ---- On Mon, 01 Oct 2018 18:36:03 +0330 Alain RODRIGUEZ
>> <arodrime@gmail.com> wrote ----
>>
>> Hello again :),
>>
>> I thought a little bit more about this question, and I was actually
>> wondering if something like this would work:
>>
>> Imagine a 3-node cluster, created using:
>> For the 3 nodes: `num_tokens: 4`
>> Node 1: `initial_token: -9223372036854775808, -4611686018427387905, -2,
>> 4611686018427387901`
>> Node 2: `initial_token: -7686143364045646507, -3074457345618258604,
>> 1537228672809129299, 6148914691236517202`
>> Node 3: `initial_token: -6148914691236517206, -1537228672809129303,
>> 3074457345618258600, 7686143364045646503`
>>
>> If you know the initial size of your cluster, you can calculate the
>> total number of tokens (number of nodes * vnodes per node) and use the
>> formula/Python code below to get the tokens. Then give the first token to
>> the first node, the second token to the second node, and so on, cycling
>> through the nodes until every token is assigned.
>> In my case there is a total of 12 tokens (3 nodes, 4 tokens each):
>> ```
>> >>> number_of_tokens = 12
>> >>> # Floor division (//) keeps the math in exact integers on Python 2 and 3
>> >>> [str(((2**64 // number_of_tokens) * i) - 2**63) for i in range(number_of_tokens)]
>> ['-9223372036854775808', '-7686143364045646507', '-6148914691236517206',
>> '-4611686018427387905', '-3074457345618258604', '-1537228672809129303',
>> '-2', '1537228672809129299', '3074457345618258600', '4611686018427387901',
>> '6148914691236517202', '7686143364045646503']
>> ```
>>
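The round-robin assignment described above can be sketched in a few lines of
Python. The function name `initial_tokens` is hypothetical, and the
evenly spaced tokens assume the Murmur3Partitioner range; the output is meant
to be pasted into each node's cassandra.yaml by hand.

```python
def initial_tokens(node_count, tokens_per_node):
    """Evenly spaced tokens, dealt to the nodes round-robin."""
    total = node_count * tokens_per_node
    ring = [(2**64 // total) * i - 2**63 for i in range(total)]
    # Node k takes tokens k, k + node_count, k + 2 * node_count, ...
    return [ring[k::node_count] for k in range(node_count)]


for node, tokens in enumerate(initial_tokens(3, 4), start=1):
    print("Node %d: initial_token: %s" % (node, ", ".join(map(str, tokens))))
```

With 3 nodes and 4 tokens each, this reproduces the three `initial_token`
lists given earlier in this message.
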
>>
>> Using manual initial_token (your idea), how could I add a new node to a
>> long-running cluster (what is the procedure)?
>>
>>

-- 
Regards,

Ahmed ELJAMI

--000000000000813112058030e462--