Subject: Re: How to replace a dead *seed* node while keeping quorum
From: Ron Siemens
Date: Wed, 10 Oct 2012 12:38:06 -0700
To: user@cassandra.apache.org
Cc: James Atwill, Edward Sargisson, Rob Coli

I witnessed the same behavior as reported by Edward and James. Removing
the host from its own seed list does not solve the problem. Removing it
from the config of all nodes and restarting each, then restarting the
failed node, worked.

Ron

On Sep 12, 2012, at 4:42 PM, Edward Sargisson wrote:

I'm reposting my colleague's reply to Rob to the list (with James' permission) in case others are interested.

I'll add to James' post below to say I don't believe we saw the message that that slice of code would have printed.

"
Hey Rob,

Ed's AWOL right now and I'm not on u@c.a.o, but I can tell you that when
I removed the downed seed node from its own list of seed nodes in
cassandra.yaml, it didn't join the existing ring, nor did it get any
schemas or data from the existing ring; it felt like timeouts were
happening. (IANA Cassandra wizard, so excuse my terminology impedance.)

After changing the machine's hostname and giving it a new IP, it behaved
as expected: joining the ring, syncing both schema and associated data.

Downed node is 1.1.4, the rest of the ring is 1.1.2.

I'm in a situation where I can revert the IP/hostname change and retry
the scenario as needed if you've got any ideas.

HTH,

   James"

Cheers,
Edward

On 12-09-12 03:53 PM, Rob Coli wrote:
On Tue, Sep 11, 2012 at 4:21 PM, Edward Sargisson
<edward.sargisson@globalrelay.net> wrote:
> If the downed node is a seed node then neither of the replace a dead node
> procedures work (-Dcassandra.replace_token and taking initial_token-1). The
> ring remains split.
> [...]
> In other words, if the host name is on the seeds list then it appears that
> the rest of the ring refuses to bootstrap it.

Close, but not exactly...

"./src/java/org/apache/cassandra/service/StorageService.java" line 559 of 3090
"
if (DatabaseDescriptor.isAutoBootstrap()
        && DatabaseDescriptor.getSeeds().contains(FBUtilities.getBroadcastAddress())
        && !SystemTable.isBootstrapped())
    logger_.info("This node will not auto bootstrap because it is configured to be a seed node.");
"

getSeeds asks your seed provider for a list of seeds. If you are using
the SimpleSeedProvider, this basically turns the list from "seeds" in
cassandra.yaml on the local node into a list of hosts.
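
To illustrate the "list from cassandra.yaml into a list of hosts" step, here is a minimal sketch. It is not the SimpleSeedProvider source; the class SeedListSketch and the method parseSeeds are hypothetical names, and the sketch assumes the "seeds" value is a comma-separated string of addresses.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

public class SeedListSketch {
    // Hypothetical helper: splits a comma-separated "seeds" value into
    // a list of addresses, ignoring surrounding whitespace and empty entries.
    public static List<InetAddress> parseSeeds(String seedsValue) {
        List<InetAddress> seeds = new ArrayList<>();
        for (String host : seedsValue.split(",")) {
            String trimmed = host.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            try {
                // For a literal IP address this only validates the format.
                seeds.add(InetAddress.getByName(trimmed));
            } catch (UnknownHostException e) {
                throw new IllegalArgumentException("Bad seed entry: " + trimmed, e);
            }
        }
        return seeds;
    }

    public static void main(String[] args) {
        // Example addresses are made up for illustration.
        for (InetAddress seed : parseSeeds("10.0.0.1, 10.0.0.2,10.0.0.3")) {
            System.out.println(seed.getHostAddress());
        }
    }
}
```

The point of the sketch is only that the list is built from the local node's own configuration, so each node can end up with a different view of which hosts are seeds.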

So it isn't that the other nodes have this node in their seed list;
it's that the node you are replacing has itself in its own seed list,
and shouldn't. I understand that it can be tricky in conf management
tools to make seed nodes' seed lists not contain themselves, but I
believe it is currently necessary in this case.
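
The three-part condition quoted above can be modelled in isolation as a rough sketch. The class and method names here (SeedBootstrapSketch, skipsAutoBootstrap) and the example addresses are hypothetical; only the shape of the boolean check is taken from the quoted snippet, and the sketch says nothing about what Cassandra does after logging the message.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SeedBootstrapSketch {
    // Mirrors the quoted condition: auto bootstrap is enabled, the node's own
    // broadcast address is in its local seed list, and it has not bootstrapped yet.
    public static boolean skipsAutoBootstrap(boolean autoBootstrap,
                                             Set<String> localSeeds,
                                             String broadcastAddress,
                                             boolean alreadyBootstrapped) {
        return autoBootstrap
                && localSeeds.contains(broadcastAddress)
                && !alreadyBootstrapped;
    }

    public static void main(String[] args) {
        String replacementNode = "10.0.0.1"; // made-up address of the node being replaced

        // Seed list that still contains the node itself.
        Set<String> seedsWithSelf = new HashSet<>(Arrays.asList("10.0.0.1", "10.0.0.2"));
        // Seed list with the node's own address taken out.
        Set<String> seedsWithoutSelf = new HashSet<>(Arrays.asList("10.0.0.2", "10.0.0.3"));

        System.out.println(skipsAutoBootstrap(true, seedsWithSelf, replacementNode, false));
        System.out.println(skipsAutoBootstrap(true, seedsWithoutSelf, replacementNode, false));
    }
}
```

Because the check only consults the local node's seed list, the seed lists on the other nodes play no part in this particular condition.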

FWIW, it's unclear to me (and to Aaron Morton, whose curiosity was
apparently equally piqued and who is looking into it further) why
exactly seed nodes shouldn't bootstrap. It's possible that they only
shouldn't bootstrap without being in "hibernate" mode, and that the
code just hasn't been re-written post replace_token/hibernate to say
that it's ok for seed nodes to bootstrap as long as they hibernate...

=Rob


--

Edward Sargisson

senior java developer
Global Relay

edward.sargisson@globalrelay.net

