From: Skye Book
To: user@cassandra.apache.org
Subject: Re: Nodes not added to existing cluster
Date: Wed, 25 Sep 2013 16:12:26 -0400

Thank you, both Michael and Robert, for your suggestions. I did actually see 5760, but we were running on 2.0.0, where it seems this was already fixed.

That said, I noticed that my Chef scripts were failing to set the broadcast_address correctly, which I'm guessing is the cause of the problem. I'm fixing that and trying a redeploy. I am curious, though: how did any of this work in the first place, spread across three AZs, without that being set?

-Skye

On Sep 25, 2013, at 3:56 PM, Robert Coli <rcoli@eventbrite.com> wrote:

On Wed, Sep 25, 2013 at 12:41 PM, Skye Book <skye.book@gmail.com> wrote:
I have a three-node cluster using the EC2 Multi-Region Snitch, currently operating only in US-EAST. After a node went down this morning, I started a new node with an identical configuration, except for the seed list, the listen address and the rpc address. The new node comes up and creates its own cluster rather than joining the pre-existing ring. I've tried creating a node both before and after using `nodetool remove` for the bad node, each time with the same result.

What version of Cassandra?

This particular confusing behavior is fixed upstream, in a version you should not deploy to production yet. Take some solace, however, that you may be the last Cassandra administrator to die for a broken code path!

https://issues.apache.org/jira/browse/CASSANDRA-5768
Does anyone have any suggestions for where to look that might put me on the right track?

It must be that your seed list is wrong in some way, or your node state is wrong. If you're trying to bootstrap a node, note that you can't bootstrap a node when it is in its own seed list.
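
The seed-list condition above can be checked before starting the node. What follows is only a minimal sketch, assuming the seeds are given as the comma-separated string used in cassandra.yaml; the function name `is_own_seed` and the addresses in the usage note are hypothetical, not part of Cassandra.

```python
def is_own_seed(node_addresses, seeds):
    """Return True if any of this node's addresses appears in its seed list.

    node_addresses: iterable of address strings the node is known by
                    (for example its listen and broadcast addresses).
    seeds:          comma-separated seed string, as written in cassandra.yaml.
    """
    # Split on commas and drop surrounding whitespace and empty entries.
    seed_set = {s.strip() for s in seeds.split(",") if s.strip()}
    return any(addr.strip() in seed_set for addr in node_addresses)
```

For example, `is_own_seed(["10.0.1.26"], "10.0.1.5, 10.0.1.26")` returns True, which would mean the new node lists itself as a seed and so will not bootstrap.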

If you have installed Cassandra via the Debian package, there is a possibility that your node started before you explicitly started it. If so, it might have invalid node state.

Have you tried wiping the data directory and trying again?

What is your seed list? Are you sure the new node can reach the seeds on the network layer?
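
One rough way to test reachability from the new node is to attempt a TCP connection to each seed. This is a sketch under assumptions: it uses port 7000, the default `storage_port`, which may differ in your configuration (for instance with encrypted inter-node traffic), and the helper name `unreachable_seeds` is hypothetical. A successful connect only shows that the port is open, not that gossip is healthy.

```python
import socket


def unreachable_seeds(seeds, port=7000, timeout=3.0):
    """Try a TCP connect to each seed; return a list of (seed, error) failures.

    seeds: comma-separated seed string, as written in cassandra.yaml.
    port:  inter-node port to probe (7000 is the default storage_port).
    """
    failures = []
    for seed in (s.strip() for s in seeds.split(",")):
        if not seed:
            continue  # skip empty entries such as a trailing comma
        try:
            # Open and immediately close the connection; we only care
            # whether the TCP handshake succeeds within the timeout.
            with socket.create_connection((seed, port), timeout=timeout):
                pass
        except OSError as exc:  # refused, timed out, unresolvable, etc.
            failures.append((seed, str(exc)))
    return failures
```

Run on the new node with the same seed string it has in its configuration; an empty result means every seed accepted a connection on that port.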

=Rob
