From: Tyler Hobbs
To: user@cassandra.apache.org
Subject: Re: Node doesn't rejoin ring after restart
Date: Sat, 11 Aug 2012 23:11:46 -0500
In-Reply-To: <501C4FC2.9050502@globalrelay.net>

Make sure that your seed list is the same for every node. Just pick two of the three nodes and use those as the seeds everywhere.

If that's not the issue, check your Cassandra log to see if there are any exceptions during startup.

On Fri, Aug 3, 2012 at 5:25 PM, Edward Sargisson <edward.sargisson@globalrelay.net> wrote:

> Hi all,
> I'm testing our procedures for handling some Cassandra failure scenarios
> and I'm not understanding something.
>
> I'm testing on a 3-node cluster with a replication_factor of 3.
> I stopped one of the nodes for 5 or so minutes and ran some application
> tests. Everything was fine.
>
> Then I started Cassandra on that node again and it refuses to rejoin the
> ring. It can see itself as up but not the other nodes. The other nodes can
> see themselves but don't see it as up.
>
> I deliberately haven't followed any of the token replacement methods
> outlined in the docs. I'm working on the assumption that a small outage on
> one node shouldn't cause extraordinary action.
>
> Nor do I want to have to stop every node before bringing them up one by
> one.
>
> What am I missing? Am I forced into those time-consuming methods every
> time I want to restart?
>
> Thoughts?
>
> Cheers,
> Edward
>
> --
> Edward Sargisson
> Senior Java Developer
> Global Relay
> edward.sargisson@globalrelay.net

--
Tyler Hobbs
DataStax
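The seed-list advice can be checked mechanically. What follows is only a rough sketch, assuming each node's seeds are set through the stock SimpleSeedProvider entry in cassandra.yaml. The node names, the addresses, and the helper functions are all made up for illustration; none of them come from the thread.

```python
import re

# Hypothetical cassandra.yaml excerpts, one per node. Addresses are invented.
SEED_BLOCK = '''
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "{seeds}"
'''

NODE_CONFIGS = {
    "node1": SEED_BLOCK.format(seeds="10.0.0.1,10.0.0.2"),
    "node2": SEED_BLOCK.format(seeds="10.0.0.1,10.0.0.2"),
    # node3 lists only itself, which is the kind of mismatch to look for.
    "node3": SEED_BLOCK.format(seeds="10.0.0.3"),
}

# Matches a line such as:   - seeds: "10.0.0.1,10.0.0.2"
SEEDS_LINE = re.compile(r'^[ \t]*-[ \t]*seeds:[ \t]*"?([^"\n]*)"?[ \t]*$',
                        re.MULTILINE)


def extract_seeds(yaml_text):
    """Return the set of seed addresses found in a cassandra.yaml excerpt."""
    match = SEEDS_LINE.search(yaml_text)
    if match is None:
        return frozenset()
    return frozenset(s.strip() for s in match.group(1).split(",") if s.strip())


def seeds_by_node(configs):
    """Map each node name to the seed set parsed from its config text."""
    return {node: extract_seeds(text) for node, text in configs.items()}


def seed_lists_agree(configs):
    """True when every node is configured with the same set of seeds."""
    return len(set(seeds_by_node(configs).values())) <= 1


if __name__ == "__main__":
    for node, seeds in sorted(seeds_by_node(NODE_CONFIGS).items()):
        print(node, sorted(seeds))
    print("seed lists agree:", seed_lists_agree(NODE_CONFIGS))
```

The regular expression is a shortcut to avoid a YAML dependency. A real check would more likely parse the file with a YAML library and read it from each node rather than from inline strings.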

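For the advice to check the Cassandra log for startup exceptions, a small filter can save some scrolling. This is a sketch under assumptions: it treats any line starting with ERROR or containing "Exception" as worth a look. The sample lines are invented for illustration and are not real Cassandra output, and the helper name is hypothetical.

```python
# Invented log lines, for illustration only (not real Cassandra output).
SAMPLE_LOG = [
    " INFO [main] 2012-08-11 21:00:00,000 made-up startup message",
    "ERROR [main] 2012-08-11 21:00:01,000 made-up Exception during startup",
    "java.lang.RuntimeException: made-up failure",
    " INFO [main] 2012-08-11 21:00:02,000 another made-up message",
]


def startup_problems(log_lines):
    """Return (line_number, text) pairs for lines that look like errors.

    A line counts if it starts with ERROR (ignoring leading spaces) or
    mentions a Java exception class name.
    """
    hits = []
    for number, line in enumerate(log_lines, start=1):
        if line.lstrip().startswith("ERROR") or "Exception" in line:
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    for number, text in startup_problems(SAMPLE_LOG):
        print(f"{number}: {text}")
```

To run it against a real log, pass an open file object instead of the sample list. The log location depends on the installation, so the path is left to the reader.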