From: Alain RODRIGUEZ <arodrime@gmail.com>
Date: Mon, 7 Sep 2015 18:27:07 +0200
Subject: Re: cassandra scalability
To: user@cassandra.apache.org

Glad to hear that.

About the proxy:

Putting an HAProxy in front of Cassandra is an anti-pattern, as you create a single point of failure. You could run several proxies, but why would you do that anyway, when most (all?) modern Cassandra clients already handle this for you through TokenAware / RoundRobin / LatencyAware strategies?

About the tokens:

You can use vnodes (set "num_tokens" to 64 or 128, say, and comment out "#initial_token"), let Cassandra handle node/token distribution, combine that with Murmur3Partitioner, and you're all set. I'm not sure what you're doing with the IPs; I still don't get it, but from what I understand it is either useless or even harmful.

Finally, as you found out, you indeed can't auto_bootstrap a seed node.

Hope this helps.

C*heers,

Alain
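As a minimal sketch of the vnodes setup described above, applied with sed in the same style as the provisioning script quoted further down (128 is only an example count, and the stock packaged config path is assumed):

    #!/bin/bash
    # Sketch: enable vnodes instead of deriving a per-node value from the IP.
    CONFIG=/etc/cassandra/conf

    # Same fixed vnode count on every node; Cassandra picks the token values itself.
    sed -i -e "s/^num_tokens.*/num_tokens: 128/" $CONFIG/cassandra.yaml

    # Do not force an explicit token; keep initial_token commented out.
    sed -i -e "s/^initial_token:/# initial_token:/" $CONFIG/cassandra.yaml

    # Murmur3Partitioner is already the default; stated here only for clarity.
    sed -i -e "s/^partitioner:.*/partitioner: org.apache.cassandra.dht.Murmur3Partitioner/" $CONFIG/cassandra.yaml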
2015-09-07 18:11 GMT+02:00 ICHIBA Sara <ichi.sara@gmail.com>:

> I think I know where my problem was coming from. I took a look at the
> Cassandra log on each node and saw something related to bootstrap: it says
> that the node is a seed, so there will be no bootstrapping. I had actually
> made a mistake: in the cassandra.yaml file, each node had two IPs listed
> as seeds, the IP of the machine itself and the IP of the real seed server.
> Once I removed the local IP, the problem seems to be fixed.
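For reference, a sketch of that fix in the style of the provisioning script quoted below, assuming $SEED holds only the dedicated seed's address and $IP the node's own address (the "broken" line is hypothetical, reconstructed from the description above):

    #!/bin/bash
    # Sketch of the seeds fix: a node that finds its own address in the seed list
    # treats itself as a seed and skips bootstrapping.
    CONFIG=/etc/cassandra/conf

    # What caused the problem: the node's own IP listed alongside the real seed.
    # sed -i -e "s/- seeds.*/- seeds: \"$SEED,$IP\"/" $CONFIG/cassandra.yaml

    # The fix: non-seed nodes list only the real seed server(s).
    sed -i -e "s/- seeds.*/- seeds: \"$SEED\"/" $CONFIG/cassandra.yaml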
> 2015-09-07 18:01 GMT+02:00 ICHIBA Sara <ichi.sara@gmail.com>:
>
>> Thank you all for your answers.
>>
>> @Alain:
>>
>> >> Can you detail actions performed, like how you load data
>> >>> I have an HAProxy in front of my Cassandra database, so I'm sure
>> >>> that my application queries a different coordinator each time.
>>
>> >> What scaleup / scaledown are, and whether you let it decommission
>> >> fully (streams finished, node removed from nodetool status)
>> >>> I'm using the OpenStack platform to autoscale the Cassandra cluster.
>> >>> In OpenStack, the combination of Ceilometer + Heat lets users
>> >>> automate the deployment of their applications and supervise their
>> >>> resources. They can order a scale-up (adding new nodes automatically)
>> >>> when resources (CPU, RAM, ...) are needed, or a scale-down (removing
>> >>> unnecessary VMs automatically).
>> >>> So with Heat I can automatically spawn a cluster of 2 Cassandra VMs
>> >>> (create the cluster and configure each Cassandra server from a
>> >>> template). My cluster can go from 2 nodes to 6 based on the workload.
>> >>> When there is a scale-down action, Heat automatically executes a
>> >>> script on my node and decommissions it before removing it.
>>
>> >> Also, I am not sure of the meaning of this --> "i'm affecting to each
>> >> of my node a different token based on there ip address (the token is
>> >> A+B+C+D and the ip is A.B.C.D)".
>>
>> Look at this:
>>
>> [root@demo-server-seed-wgevseugyjd7 ~]# nodetool status bng;
>> Datacenter: DC1
>> ===============
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address     Load       Tokens  Owns (effective)  Host ID                               Rack
>> UN  40.0.0.149  789.03 KB  189     100.0%            bd0b2616-18d9-4bc2-a80b-eebd67474712  RAC1
>> UN  40.0.0.168  300.38 KB  208     100.0%            ebd9732b-ebfc-4a6c-b354-d7df860b57b0  RAC1
>>
>> The node with address 40.0.0.149 has the token 189 = 40+0+0+149,
>> and the node with address 40.0.0.168 has the token 208 = 40+0+0+168.
>>
>> This way I'm sure that each node in my cluster will have a different
>> token. I don't know what would happen if all the nodes had the same token??
>>
>> >> Aren't you using RandomPartitioner or Murmur3Partitioner
>>
>> I'm using the default one, which is
>> partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>>
>> In order to configure Cassandra on each node I'm using this script:
>>
>>       inputs:
>>       - name: IP
>>       - name: SEED
>>       config: |
>>         #!/bin/bash -v
>>         cat << EOF >> /etc/resolv.conf
>>         nameserver 8.8.8.8
>>         nameserver 192.168.5.1
>>         EOF
>>
>>         DEFAULT=${DEFAULT:-/etc/cassandra/default.conf}
>>         CONFIG=/etc/cassandra/conf
>>         IFS="." read a b c d <<< $IP
>>         s="$[a[0]+b[0]+c[0]+d[0]]"
>>         sed -i -e "s/^cluster_name.*/cluster_name: 'Cassandra cluster for freeradius'/" $CONFIG/cassandra.yaml
>>         sed -i -e "s/^num_tokens.*/num_tokens: $s/" $CONFIG/cassandra.yaml
>>         sed -i -e "s/^listen_address.*/listen_address: $IP/" $CONFIG/cassandra.yaml
>>         sed -i -e "s/^rpc_address.*/rpc_address: 0.0.0.0/" $CONFIG/cassandra.yaml
>>         sed -i -e "s/^broadcast_address.*/broadcast_address: $IP/" $CONFIG/cassandra.yaml
>>         sed -i -e "s/broadcast_rpc_address.*/broadcast_rpc_address: $IP/" $CONFIG/cassandra.yaml
>>         sed -i -e "s/^commitlog_segment_size_in_mb.*/commitlog_segment_size_in_mb: 32/" $CONFIG/cassandra.yaml
>>         sed -i -e "s/# JVM_OPTS=\"\$JVM_OPTS -Djava.rmi.server.hostname=<public name>\"/JVM_OPTS=\"\$JVM_OPTS -Djava.rmi.server.hostname=$IP\"/" $CONFIG/cassandra-env.sh
>>         sed -i -e "s/- seeds.*/- seeds: \"$SEED\"/" $CONFIG/cassandra.yaml
>>         sed -i -e "s/^endpoint_snitch.*/endpoint_snitch: GossipingPropertyFileSnitch/" $CONFIG/cassandra.yaml
>>         echo MAX_HEAP_SIZE="4G" >> $CONFIG/cassandra-env.sh
>>         echo HEAP_NEWSIZE="800M" >> $CONFIG/cassandra-env.sh
>>         service cassandra stop
>>         rm -rf /var/lib/cassandra/data/system/*
>>         service cassandra start
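A side note on the "num_tokens: $s" line in that script: num_tokens is the number of vnodes a node claims, not a token value, so deriving it from the IP only gives each node a different (and not even guaranteed unique) vnode count, which skews ownership rather than preventing token collisions. A quick illustration in the same shell idiom, with made-up addresses:

    #!/bin/bash
    # num_tokens is a vnode count, not a token: distinct IPs can yield the same sum.
    # Addresses below are hypothetical, chosen only to show the collision.
    for IP in 40.0.0.149 40.0.149.0 10.30.0.149; do
        IFS="." read a b c d <<< "$IP"
        echo "$IP -> num_tokens $((a + b + c + d))"
    done
    # All three print 189.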
>>
>> 2015-09-07 16:30 GMT+02:00 Ryan Svihla <rs@foundev.pro>:
>>
>>> If that's what tracing is telling you, then it's fine and just a product
>>> of data distribution (note your token counts aren't identical anyway).
>>>
>>> If you're doing CL ONE queries directly against particular nodes and
>>> getting different results, it sounds like dropped mutations, streaming
>>> errors and/or timeouts. Does running repair, or reading at CL ALL, give
>>> you an accurate total record count?
>>>
>>> nodetool tpstats should help identify dropped mutations post-bootstrap,
>>> but you also want to monitor the logs for any errors (in general this is
>>> always good advice for any system). There could be a myriad of problems
>>> with bootstrapping new nodes; usually this is related to
>>> under-provisioning.
>>>
>>> On Mon, Sep 7, 2015 at 8:19 AM Alain RODRIGUEZ <arodrime@gmail.com>
>>> wrote:
>>>
>>>> Hi Sara,
>>>>
>>>> Can you detail the actions performed, like how you load data, what
>>>> scaleup / scaledown are, and whether you let the node decommission
>>>> fully (streams finished, node removed from nodetool status), etc.?
>>>>
>>>> This would help us to help you :).
>>>>
>>>> Also, what happens if you query using "CONSISTENCY LOCAL_QUORUM;" (or
>>>> ALL) before your select? If not using cqlsh, set the consistency level
>>>> of your client to LOCAL_QUORUM or ALL and try the select again.
>>>>
>>>> Also, I am not sure of the meaning of this --> "i'm affecting to each
>>>> of my node a different token based on there ip address (the token is
>>>> A+B+C+D and the ip is A.B.C.D)". Aren't you using RandomPartitioner or
>>>> Murmur3Partitioner?
>>>>
>>>> C*heers,
>>>>
>>>> Alain
>>>>
>>>>
>>>> 2015-09-07 12:01 GMT+02:00 Edouard COLE <Edouard.COLE@rgsystem.com>:
>>>>
>>>>> Please, don't mail me directly.
>>>>>
>>>>> I read your answer, but I cannot help any further.
>>>>>
>>>>> And answering with "Sorry, I can't help" is pointless :)
>>>>>
>>>>> Wait for the community to answer.
>>>>>
>>>>> From: ICHIBA Sara [mailto:ichi.sara@gmail.com]
>>>>> Sent: Monday, September 07, 2015 11:34 AM
>>>>> To: user@cassandra.apache.org
>>>>> Subject: Re: cassandra scalability
>>>>>
>>>>> When there's a scaledown action, I make sure to decommission the node
>>>>> first. But still, I don't understand why I'm having this behaviour; is
>>>>> it normal? What do you normally do to remove a node? Is it related to
>>>>> tokens? I'm assigning each of my nodes a different token based on its
>>>>> IP address (the token is A+B+C+D and the IP is A.B.C.D).
>>>>>
>>>>> 2015-09-07 11:28 GMT+02:00 ICHIBA Sara <ichi.sara@gmail.com>:
>>>>>
>>>>> At the beginning it looks like this:
>>>>>
>>>>> [root@demo-server-seed-k6g62qr57nok ~]# nodetool status
>>>>> Datacenter: DC1
>>>>> ===============
>>>>> Status=Up/Down
>>>>> |/ State=Normal/Leaving/Joining/Moving
>>>>> --  Address     Load       Tokens  Owns  Host ID                               Rack
>>>>> UN  40.0.0.208  128.73 KB  248     ?     6e7788f9-56bf-4314-a23a-3bf1642d0606  RAC1
>>>>> UN  40.0.0.209  114.59 KB  249     ?     84f6f0be-6633-4c36-b341-b968ff91a58f  RAC1
>>>>> UN  40.0.0.205  129.53 KB  245     ?     aa233dc2-a8ae-4c00-af74-0a119825237f  RAC1
>>>>>
>>>>> [root@demo-server-seed-k6g62qr57nok ~]# nodetool status service_dictionary
>>>>> Datacenter: DC1
>>>>> ===============
>>>>> Status=Up/Down
>>>>> |/ State=Normal/Leaving/Joining/Moving
>>>>> --  Address     Load       Tokens  Owns (effective)  Host ID                               Rack
>>>>> UN  40.0.0.208  128.73 KB  248     68.8%             6e7788f9-56bf-4314-a23a-3bf1642d0606  RAC1
>>>>> UN  40.0.0.209  114.59 KB  249     67.8%             84f6f0be-6633-4c36-b341-b968ff91a58f  RAC1
>>>>> UN  40.0.0.205  129.53 KB  245     63.5%             aa233dc2-a8ae-4c00-af74-0a119825237f  RAC1
>>>>>
>>>>> The result of the query select * from service_dictionary.table1; gave me
>>>>> 70 rows from 40.0.0.205
>>>>> 64 from 40.0.0.209
>>>>> 54 from 40.0.0.208
>>>>>
>>>>> 2015-09-07 11:13 GMT+02:00 Edouard COLE <Edouard.COLE@rgsystem.com>:
>>>>>
>>>>> Could you provide the result of:
>>>>> - nodetool status
>>>>> - nodetool status YOURKEYSPACE
>>>
>>> --
>>> Regards,
>>>
>>> Ryan Svihla
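To act on Ryan's and Alain's suggestions above, a rough sketch: re-read the table at CONSISTENCY ALL from cqlsh and check each node for dropped messages. The keyspace, table and addresses are the ones from this thread; the loop assumes root SSH access to each node (otherwise run nodetool locally on every node).

    #!/bin/bash
    # Sketch: count rows at the strongest consistency level, then look for
    # dropped messages on each node after the bootstrap.
    cqlsh 40.0.0.205 -e "CONSISTENCY ALL; SELECT count(*) FROM service_dictionary.table1;"

    for host in 40.0.0.205 40.0.0.208 40.0.0.209; do
        echo "== $host =="
        ssh root@"$host" "nodetool tpstats | grep -i -A 12 dropped"
    done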