incubator-cassandra-user mailing list archives

From Skye Book <skye.b...@gmail.com>
Subject Re: Nodes not added to existing cluster
Date Thu, 26 Sep 2013 05:13:05 GMT
Hi Aaron, thanks for the clarification.

As might be expected, correcting the broadcast_address hasn't fixed anything.  What I did
find after writing my last email is that output.log is littered with these:

 INFO 05:03:49,015 Cannot handshake version with /aa.bb.cc.dd
 INFO 05:03:49,017 Handshaking version with /aa.bb.cc.dd
 INFO 05:03:49,803 Cannot handshake version with /ww.xx.yy.zz
 INFO 05:03:49,805 Handshaking version with /ww.xx.yy.zz

The two addresses it can't handshake with belong to the other two nodes in the cluster
I'm unable to join.  I started thinking that maybe EC2 was having an unadvertised
problem communicating between AZ's, but bringing up nodes in both of the other availability
zones resulted in the same incorrect behavior.
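
To rule out the network layer, a rough check from the new node might look like
this (a sketch, assuming the default storage_port of 7000 and that netcat is
installed; the addresses are the redacted ones above):

    # can the new node reach the other nodes' storage port?
    nc -zv aa.bb.cc.dd 7000
    nc -zv ww.xx.yy.zz 7000
    # if these fail, the EC2 security group rules are the first place to look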

I've gist'd my cassandra.yaml; it's pretty standard and hasn't caused an issue in the past
for me.  https://gist.github.com/skyebook/ec9364cdcec02e803ffc
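
For anyone skimming, the handful of fields in that file that matter here can be
pulled out like so (a sketch; the path assumes a package install):

    grep -E '^(cluster_name|listen_address|broadcast_address|endpoint_snitch)' \
        /etc/cassandra/cassandra.yaml
    grep 'seeds:' /etc/cassandra/cassandra.yaml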

Skye Book
http://skyebook.net -- @sbook

On Sep 26, 2013, at 12:34 AM, Aaron Morton <aaron@thelastpickle.com> wrote:

>> I am curious, though, how any of this worked in the first place spread across
>> three AZ's without that being set?
> broadcast_address is only needed when you are going cross-region (IIRC it's the
> Ec2MultiRegionSnitch that sets it).
> 
> As Rob said, make sure the seed list includes one of the other nodes and that the
> cluster_name is set.
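> 
> A quick way to check both, for example (hostnames are placeholders; the Name
> reported must match exactly or the nodes won't join each other):
> 
>     nodetool -h <existing-node> describecluster   # note the cluster Name
>     nodetool -h <new-node> describecluster        # should report the same Name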
> 
> Cheers
> 
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
> 
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
> 
> On 26/09/2013, at 8:12 AM, Skye Book <skye.book@gmail.com> wrote:
> 
>> Thank you, both Michael and Robert, for your suggestions.  I actually saw 5768, but
>> we were running on 2.0.0, which it seems this was fixed in.
>> 
>> That said, I noticed that my Chef scripts were failing to set the broadcast_address
>> correctly, which I'm guessing is the cause of the problem; I'm fixing that and trying a
>> redeploy.  I am curious, though, how any of this worked in the first place spread across
>> three AZ's without that being set?
>> 
>> -Skye
>> 
>> On Sep 25, 2013, at 3:56 PM, Robert Coli <rcoli@eventbrite.com> wrote:
>> 
>>> On Wed, Sep 25, 2013 at 12:41 PM, Skye Book <skye.book@gmail.com> wrote:
>>> I have a three node cluster using the EC2 Multi-Region Snitch currently operating
>>> only in US-EAST.  On having a node go down this morning, I started a new node with an
>>> identical configuration, except for the seed list, the listen address and the rpc
>>> address.  The new node comes up and creates its own cluster rather than joining the
>>> pre-existing ring.  I've tried creating a node both before and after using `nodetool
>>> remove` for the bad node, each time with the same result.
>>> 
>>> What version of Cassandra?
>>> 
>>> This particular confusing behavior is fixed upstream, in a version you should
>>> not deploy to production yet.  Take some solace, however, that you may be the last
>>> Cassandra administrator to die for a broken code path!
>>> 
>>> https://issues.apache.org/jira/browse/CASSANDRA-5768
>>> 
>>> Does anyone have any suggestions for where to look that might put me on the right
>>> track?
>>> 
>>> It must be that your seed list is wrong in some way, or your node state is wrong.
>>> If you're trying to bootstrap a node, note that you can't bootstrap a node when it is
>>> in its own seed list.
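>>> 
>>> For example (addresses are made up), the seeds on the new node should name only
>>> established nodes, never the node itself:
>>> 
>>>     # on the new node:
>>>     grep -A4 'seed_provider' /etc/cassandra/cassandra.yaml
>>>     # expect something like:
>>>     #     - seeds: "10.0.1.10,10.0.2.11"    <-- existing nodes, not this one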
>>> 
>>> If you have installed Cassandra via debian package, there is a possibility that
>>> your node has started before you explicitly started it.  If so, it might have invalid
>>> node state.
>>> 
>>> Have you tried wiping the data directory and trying again?
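>>> 
>>> Something along these lines (a sketch; the paths are the debian package
>>> defaults, so adjust if yours differ):
>>> 
>>>     sudo service cassandra stop
>>>     sudo rm -rf /var/lib/cassandra/data /var/lib/cassandra/commitlog \
>>>                 /var/lib/cassandra/saved_caches
>>>     sudo service cassandra start    # node comes back up with clean state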
>>> 
>>> What is your seed list? Are you sure the new node can reach the seeds on the
>>> network layer?
>>> 
>>> =Rob
>> 
> 

