From: S C <asf11@outlook.com>
To: user@cassandra.apache.org
Subject: RE: gossip not working
Date: Mon, 8 Apr 2013 11:53:31 -0500

I did try this option and everything is working fine. Thank you, Aaron.


From: aaron@thelastpickle.com
Subject: Re: gossip not working
Date: Fri, 5 Apr 2013 23:02:58 +0530
To: user@cassandra.apache.org

Starting the node with the JVM option -Dcassandra.load_ring_state=false in cassandra-env.sh sometimes works.

If not, post the output from nodetool gossipinfo.
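
For readers following along, the option above is usually added as one extra line in cassandra-env.sh. A minimal sketch, assuming the stock script that accumulates JVM flags in a JVM_OPTS variable (the file location depends on the install and is not given in this thread):

```shell
# Sketch of a line for cassandra-env.sh (file location varies by install).
# Assumption: the script builds up JVM flags in JVM_OPTS, as the stock one does.
# The flag asks the node to skip the ring state it saved earlier and to
# rebuild its view of the ring from gossip at startup.
JVM_OPTS="$JVM_OPTS -Dcassandra.load_ring_state=false"
```

The node has to be restarted for the flag to take effect. It is generally treated as a one-off, so it may be worth removing the line again once the ring looks healthy.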

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 5/04/2013, at 9:38 AM, S C <asf11@outlook.com> wrote:

Is there a way to force gossip among the nodes?


From: asf11@outlook.com
To: user@cassandra.apache.org
Subject: RE: gossip not working
Date: Thu, 4 Apr 2013 19:59:45 -0500

I am not seeing anything in the logs other than "Starting up server gossip", and there is no firewall between the nodes.

From: paulsudol@gmail.com
Subject: Re: gossip not working
Date: Thu, 4 Apr 2013 18:49:29 -0500
To: user@cassandra.apache.org

What errors are you seeing in the log files of the down nodes? Did you run upgradesstables? You need to run upgradesstables when moving from < 1.1.7 to 1.1.9.
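
A rough sketch of the step being asked about, assuming nodetool is on the PATH and the command is run once per upgraded node. The helper name upgrade_sstables_on and the NODETOOL override are made up for illustration; only the upgradesstables subcommand and the -h host flag come from nodetool itself:

```shell
# Hypothetical helper: rewrite a node's SSTables in the current on-disk format.
# NODETOOL can be overridden (for example with "echo") to preview the command.
NODETOOL="${NODETOOL:-nodetool}"

upgrade_sstables_on() {
    host="$1"
    "$NODETOOL" -h "$host" upgradesstables
}

# Example (run once the node at that address is up on the new version):
#   upgrade_sstables_on 192.168.56.12
```

Wrapping the call in a function keeps the snippet safe to source without touching a live node.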

On Apr 4, 2013, at 6:11 PM, S C <asf11@outlook.com> wrote:

I was in the middle of an upgrade to 1.1.9. I brought up one node on 1.1.9 while the others were still running 1.1.5. Once that node was on 1.1.9, it no longer recognized the other nodes in the ring.

On 192.168.56.10 and 11

192.168.56.10  DC1-Cass  RAC1  Up    Normal  28.06 GB  50.00%  0
192.168.56.11  DC1-Cass  RAC1  Up    Normal  31.59 GB  25.00%  42535295865117307932921825928971026432
192.168.56.12  DC1-Cass  RAC1  Down  Normal  29.02 GB  25.00%  85070591730234615865843651857942052864

On 192.168.56.12

192.168.56.10  DC1-Cass  RAC1  Down  Normal  28.06 GB  50.00%  0
192.168.56.11  DC1-Cass  RAC1  Down  Normal  31.59 GB  25.00%  42535295865117307932921825928971026432
192.168.56.12  DC1-Cass  RAC1  Up    Normal  29.02 GB  25.00%  85070591730234615865843651857942052864


I do not see anything in the logs that tells me that there is a gossip issue.

nodetool info
Token            : 85070591730234615865843651857942052864
Gossip active    : true
Thrift active    : true
Load             : 29.05 GB
Generation No    : 1365114563
Uptime (seconds) : 2127
Heap Memory (MB) : 848.71 / 7945.94
Exceptions       : 0
Key Cache        : size 2208 (bytes), capacity 104857584 (bytes), 1056 hits, 1099 requests, 0.961 recent hit rate, 14400 save period in seconds
Row Cache        : size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds

nodetool info
Token            : 42535295865117307932921825928971026432
Gossip active    : true
Thrift active    : true
Load             : 31.59 GB
Generation No    : 1364413038
Uptime (seconds) : 703904
Heap Memory (MB) : 733.02 / 7945.94
Exceptions       : 1
Key Cache        : size 3693312 (bytes), capacity 104857584 (bytes), 26071678 hits, 26616282 requests, 0.980 recent hit rate, 14400 save period in seconds
Row Cache        : size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds



There is no firewall between the nodes, and each node can reach the others on the storage port.
What else should I be looking at to find the root cause? I appreciate your input.
