From: Alain RODRIGUEZ <arodrime@gmail.com>
Date: Mon, 7 Sep 2015 18:27:07 +0200
Subject: Re: cassandra scalability
To: user@cassandra.apache.org

Glad to hear that.

About the proxy:

Putting an HAProxy in front of Cassandra is an anti-pattern, as you create a single point of failure. You could run several proxies, but why would you do that anyway, when most (all?) modern Cassandra clients already handle this for you through TokenAware / RoundRobin / LatencyAware strategies?

About the tokens:

You can use vnodes (set "num_tokens" to 64 or 128, say, and comment out "#initial_token"), let Cassandra handle node/token distribution, combine that with Murmur3Partitioner, and you're all set. I'm not sure what you're doing with the IPs; I still don't get it, but from what I understand it is either useless or even harmful.

Finally, as you found out, you indeed can't auto_bootstrap a seed node.

Hope this helps.

C*heers,

Alain
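As a minimal sketch of the vnodes setup described above, applied with sed in the same style as the provisioning script quoted further down (128 is only an example count, and the stock packaged config path is assumed):

    #!/bin/bash
    # Sketch: enable vnodes instead of deriving a per-node value from the IP.
    CONFIG=/etc/cassandra/conf

    # Same fixed vnode count on every node; Cassandra picks the token values itself.
    sed -i -e "s/^num_tokens.*/num_tokens: 128/" $CONFIG/cassandra.yaml

    # Do not force an explicit token; keep initial_token commented out.
    sed -i -e "s/^initial_token:/# initial_token:/" $CONFIG/cassandra.yaml

    # Murmur3Partitioner is already the default; stated here only for clarity.
    sed -i -e "s/^partitioner:.*/partitioner: org.apache.cassandra.dht.Murmur3Partitioner/" $CONFIG/cassandra.yaml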
2015-09-07 18:11 GMT+02:00 ICHIBA Sara <ichi.sara@gmail.com>:

> I think I know where my problem was coming from. I took a look at the
> Cassandra log on each node and saw something related to bootstrap: it says
> that the node is a seed, so there will be no bootstrapping. I had actually
> made a mistake: in the cassandra.yaml file, each node had two IPs listed
> as seeds, the IP of the machine itself and the IP of the real seed server.
> Once I removed the local IP, the problem seems to be fixed.
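For reference, a sketch of that fix in the style of the provisioning script quoted below, assuming $SEED holds only the dedicated seed's address and $IP the node's own address (the "broken" line is hypothetical, reconstructed from the description above):

    #!/bin/bash
    # Sketch of the seeds fix: a node that finds its own address in the seed list
    # treats itself as a seed and skips bootstrapping.
    CONFIG=/etc/cassandra/conf

    # What caused the problem: the node's own IP listed alongside the real seed.
    # sed -i -e "s/- seeds.*/- seeds: \"$SEED,$IP\"/" $CONFIG/cassandra.yaml

    # The fix: non-seed nodes list only the real seed server(s).
    sed -i -e "s/- seeds.*/- seeds: \"$SEED\"/" $CONFIG/cassandra.yaml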
> 2015-09-07 18:01 GMT+02:00 ICHIBA Sara <ichi.sara@gmail.com>:
>
>> Thank you all for your answers.
>>
>> @Alain:
>>
>> >> Can you detail actions performed, like how you load data
>> >>> I have an HAProxy in front of my Cassandra database, so I'm sure
>> >>> that my application queries a different coordinator each time.
>>
>> >> What scaleup / scaledown are, and whether you let it decommission
>> >> fully (streams finished, node removed from nodetool status)
>> >>> I'm using the OpenStack platform to autoscale the Cassandra cluster.
>> >>> In OpenStack, the combination of Ceilometer + Heat lets users
>> >>> automate the deployment of their applications and supervise their
>> >>> resources. They can order a scale-up (adding new nodes automatically)
>> >>> when resources (CPU, RAM, ...) are needed, or a scale-down (removing
>> >>> unnecessary VMs automatically).
>> >>> So with Heat I can automatically spawn a cluster of 2 Cassandra VMs
>> >>> (create the cluster and configure each Cassandra server from a
>> >>> template). My cluster can go from 2 nodes to 6 based on the workload.
>> >>> When there is a scale-down action, Heat automatically executes a
>> >>> script on my node and decommissions it before removing it.
>>
>> >> Also, I am not sure of the meaning of this --> "i'm affecting to each
>> >> of my node a different token based on there ip address (the token is
>> >> A+B+C+D and the ip is A.B.C.D)".
>>
>> Look at this:
>>
>> [root@demo-server-seed-wgevseugyjd7 ~]# nodetool status bng;
>> Datacenter: DC1
>> ===============
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address     Load       Tokens  Owns (effective)  Host ID                               Rack
>> UN  40.0.0.149  789.03 KB  189     100.0%            bd0b2616-18d9-4bc2-a80b-eebd67474712  RAC1
>> UN  40.0.0.168  300.38 KB  208     100.0%            ebd9732b-ebfc-4a6c-b354-d7df860b57b0  RAC1
>>
>> The node with address 40.0.0.149 has the token 189 = 40+0+0+149,
>> and the node with address 40.0.0.168 has the token 208 = 40+0+0+168.
>>
>> This way I'm sure that each node in my cluster will have a different
>> token. I don't know what would happen if all the nodes had the same token??
>>
>> >> Aren't you using RandomPartitioner or Murmur3Partitioner
>>
>> I'm using the default one, which is
>> partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>>
>> In order to configure Cassandra on each node I'm using this script:
>>
>>       inputs:
>>       - name: IP
>>       - name: SEED
>>       config: |
>>         #!/bin/bash -v
>>         cat << EOF >> /etc/resolv.conf
>>         nameserver 8.8.8.8
>>         nameserver 192.168.5.1
>>         EOF
>>
>>         DEFAULT=${DEFAULT:-/etc/cassandra/default.conf}
>>         CONFIG=/etc/cassandra/conf
>>         IFS="." read a b c d <<< $IP
>>         s="$[a[0]+b[0]+c[0]+d[0]]"
>>         sed -i -e "s/^cluster_name.*/cluster_name: 'Cassandra cluster for freeradius'/" $CONFIG/cassandra.yaml
>>         sed -i -e "s/^num_tokens.*/num_tokens: $s/" $CONFIG/cassandra.yaml
>>         sed -i -e "s/^listen_address.*/listen_address: $IP/" $CONFIG/cassandra.yaml
>>         sed -i -e "s/^rpc_address.*/rpc_address: 0.0.0.0/" $CONFIG/cassandra.yaml
>>         sed -i -e "s/^broadcast_address.*/broadcast_address: $IP/" $CONFIG/cassandra.yaml
>>         sed -i -e "s/broadcast_rpc_address.*/broadcast_rpc_address: $IP/" $CONFIG/cassandra.yaml
>>         sed -i -e "s/^commitlog_segment_size_in_mb.*/commitlog_segment_size_in_mb: 32/" $CONFIG/cassandra.yaml
>>         sed -i -e "s/# JVM_OPTS=\"\$JVM_OPTS -Djava.rmi.server.hostname=<public name>\"/JVM_OPTS=\"\$JVM_OPTS -Djava.rmi.server.hostname=$IP\"/" $CONFIG/cassandra-env.sh
>>         sed -i -e "s/- seeds.*/- seeds: \"$SEED\"/" $CONFIG/cassandra.yaml
>>         sed -i -e "s/^endpoint_snitch.*/endpoint_snitch: GossipingPropertyFileSnitch/" $CONFIG/cassandra.yaml
>>         echo MAX_HEAP_SIZE="4G" >> $CONFIG/cassandra-env.sh
>>         echo HEAP_NEWSIZE="800M" >> $CONFIG/cassandra-env.sh
>>         service cassandra stop
>>         rm -rf /var/lib/cassandra/data/system/*
>>         service cassandra start
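A side note on the "num_tokens: $s" line in that script: num_tokens is the number of vnodes a node claims, not a token value, so deriving it from the IP only gives each node a different (and not even guaranteed unique) vnode count, which skews ownership rather than preventing token collisions. A quick illustration in the same shell idiom, with made-up addresses:

    #!/bin/bash
    # num_tokens is a vnode count, not a token: distinct IPs can yield the same sum.
    # Addresses below are hypothetical, chosen only to show the collision.
    for IP in 40.0.0.149 40.0.149.0 10.30.0.149; do
        IFS="." read a b c d <<< "$IP"
        echo "$IP -> num_tokens $((a + b + c + d))"
    done
    # All three print 189.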
>>
>> 2015-09-07 16:30 GMT+02:00 Ryan Svihla <rs@foundev.pro>:
>>
>>> If that's what tracing is telling you, then it's fine and just a product
>>> of data distribution (note your token counts aren't identical anyway).
>>>
>>> If you're doing CL ONE queries directly against particular nodes and
>>> getting different results, it sounds like dropped mutations, streaming
>>> errors and/or timeouts. Does running repair, or reading at CL ALL, give
>>> you an accurate total record count?
>>>
>>> nodetool tpstats should help identify dropped mutations post-bootstrap,
>>> but you also want to monitor the logs for any errors (in general this is
>>> always good advice for any system). There could be a myriad of problems
>>> with bootstrapping new nodes; usually this is related to
>>> under-provisioning.
>>>
>>> On Mon, Sep 7, 2015 at 8:19 AM Alain RODRIGUEZ <arodrime@gmail.com>
>>> wrote:
>>>
>>>> Hi Sara,
>>>>
>>>> Can you detail the actions performed, like how you load data, what
>>>> scaleup / scaledown are, and whether you let the node decommission
>>>> fully (streams finished, node removed from nodetool status), etc.?
>>>>
>>>> This would help us to help you :).
>>>>
>>>> Also, what happens if you query using "CONSISTENCY LOCAL_QUORUM;" (or
>>>> ALL) before your select? If not using cqlsh, set the consistency level
>>>> of your client to LOCAL_QUORUM or ALL and try the select again.
>>>>
>>>> Also, I am not sure of the meaning of this --> "i'm affecting to each
>>>> of my node a different token based on there ip address (the token is
>>>> A+B+C+D and the ip is A.B.C.D)". Aren't you using RandomPartitioner or
>>>> Murmur3Partitioner?
>>>>
>>>> C*heers,
>>>>
>>>> Alain
>>>>
>>>>
>>>> 2015-09-07 12:01 GMT+02:00 Edouard COLE <Edouard.COLE@rgsystem.com>:
>>>>
>>>>> Please, don't mail me directly.
>>>>>
>>>>> I read your answer, but I cannot help any further.
>>>>>
>>>>> And answering with "Sorry, I can't help" is pointless :)
>>>>>
>>>>> Wait for the community to answer.
>>>>>
>>>>> From: ICHIBA Sara [mailto:ichi.sara@gmail.com]
>>>>> Sent: Monday, September 07, 2015 11:34 AM
>>>>> To: user@cassandra.apache.org
>>>>> Subject: Re: cassandra scalability
>>>>>
>>>>> When there's a scaledown action, I make sure to decommission the node
>>>>> first. But still, I don't understand why I'm having this behaviour; is
>>>>> it normal? What do you normally do to remove a node? Is it related to
>>>>> tokens? I'm assigning each of my nodes a different token based on its
>>>>> IP address (the token is A+B+C+D and the IP is A.B.C.D).
>>>>>
>>>>> 2015-09-07 11:28 GMT+02:00 ICHIBA Sara <ichi.sara@gmail.com>:
>>>>>
>>>>> At the beginning it looks like this:
>>>>>
>>>>> [root@demo-server-seed-k6g62qr57nok ~]# nodetool status
>>>>> Datacenter: DC1
>>>>> ===============
>>>>> Status=Up/Down
>>>>> |/ State=Normal/Leaving/Joining/Moving
>>>>> --  Address     Load       Tokens  Owns  Host ID                               Rack
>>>>> UN  40.0.0.208  128.73 KB  248     ?     6e7788f9-56bf-4314-a23a-3bf1642d0606  RAC1
>>>>> UN  40.0.0.209  114.59 KB  249     ?     84f6f0be-6633-4c36-b341-b968ff91a58f  RAC1
>>>>> UN  40.0.0.205  129.53 KB  245     ?     aa233dc2-a8ae-4c00-af74-0a119825237f  RAC1
>>>>>
>>>>> [root@demo-server-seed-k6g62qr57nok ~]# nodetool status service_dictionary
>>>>> Datacenter: DC1
>>>>> ===============
>>>>> Status=Up/Down
>>>>> |/ State=Normal/Leaving/Joining/Moving
>>>>> --  Address     Load       Tokens  Owns (effective)  Host ID                               Rack
>>>>> UN  40.0.0.208  128.73 KB  248     68.8%             6e7788f9-56bf-4314-a23a-3bf1642d0606  RAC1
>>>>> UN  40.0.0.209  114.59 KB  249     67.8%             84f6f0be-6633-4c36-b341-b968ff91a58f  RAC1
>>>>> UN  40.0.0.205  129.53 KB  245     63.5%             aa233dc2-a8ae-4c00-af74-0a119825237f  RAC1
>>>>>
>>>>> The result of the query select * from service_dictionary.table1; gave me
>>>>> 70 rows from 40.0.0.205
>>>>> 64 from 40.0.0.209
>>>>> 54 from 40.0.0.208
>>>>>
>>>>> 2015-09-07 11:13 GMT+02:00 Edouard COLE <Edouard.COLE@rgsystem.com>:
>>>>>
>>>>> Could you provide the result of:
>>>>> - nodetool status
>>>>> - nodetool status YOURKEYSPACE
>>>
>>> --
>>> Regards,
>>>
>>> Ryan Svihla
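To act on Ryan's and Alain's suggestions above, a rough sketch: re-read the table at CONSISTENCY ALL from cqlsh and check each node for dropped messages. The keyspace, table and addresses are the ones from this thread; the loop assumes root SSH access to each node (otherwise run nodetool locally on every node).

    #!/bin/bash
    # Sketch: count rows at the strongest consistency level, then look for
    # dropped messages on each node after the bootstrap.
    cqlsh 40.0.0.205 -e "CONSISTENCY ALL; SELECT count(*) FROM service_dictionary.table1;"

    for host in 40.0.0.205 40.0.0.208 40.0.0.209; do
        echo "== $host =="
        ssh root@"$host" "nodetool tpstats | grep -i -A 12 dropped"
    done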