cassandra-user mailing list archives

From Paulo Ricardo Motta Gomes <paulo.mo...@chaordicsystems.com>
Subject Re: Dead node seen as UP by replacement node
Date Wed, 12 Mar 2014 23:06:52 GMT
Some further info:

I'm not using vnodes, so I'm using the 1.1 replace-node trick: setting
initial_token in the cassandra.yaml file to the dead node's token minus 1,
with auto_bootstrap set to true. However, according to the Apache wiki (
https://wiki.apache.org/cassandra/Operations#For_versions_1.2.0_and_above),
on 1.2 you should actually remove the dead node from the ring before
adding a replacement node.

Does that mean the trick of setting the initial token to the dead node's
token minus 1 (described in
http://www.datastax.com/docs/1.1/cluster_management#replacing-a-dead-node) is
no longer valid in 1.2 without vnodes?
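
To make the setup concrete, here is a minimal sketch of the two settings I
mean. The helper names (replacement_token, yaml_fragment) and the example
token are made up for illustration; the real token would come from
`nodetool ring`. It assumes a non-vnode ring with one token per node and does
not handle a dead node sitting at the very start of the partitioner's token
range.

```python
# Hypothetical helpers: derive the replacement node's settings from the
# dead node's token (non-vnode ring, one token per node).

def replacement_token(dead_token: int) -> int:
    """Return the token immediately before the dead node's token."""
    return dead_token - 1


def yaml_fragment(dead_token: int) -> str:
    """Render the two cassandra.yaml settings used by the 1.1 trick."""
    return (
        f"initial_token: {replacement_token(dead_token)}\n"
        "auto_bootstrap: true\n"
    )


if __name__ == "__main__":
    # Example token only; substitute the dead node's actual token.
    print(yaml_fragment(85070591730234615865843651857942052864))
```

The output is just the fragment to paste into the replacement node's
cassandra.yaml before its first start.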


On Wed, Mar 12, 2014 at 5:57 PM, Paulo Ricardo Motta Gomes <
paulo.motta@chaordicsystems.com> wrote:

> Hello,
>
> I'm trying to replace a dead node using the procedure in [1], but the
> replacement node initially sees the dead node as UP, and after a few
> minutes the node is marked as DOWN again, failing the streaming/bootstrap
> procedure of the replacement node. This dead node is always seen as DOWN by
> the rest of the cluster.
>
> Could this be a bug? I can easily reproduce it in our production
> environment, but don't know if it's reproducible in a clean environment.
>
> Version: 1.2.13
>
> Here is the log from the replacement node (192.168.1.10 is the dead node):
>
>  INFO [GossipStage:1] 2014-03-12 20:25:41,089 Gossiper.java (line 843) Node /192.168.1.10 is now part of the cluster
>  INFO [GossipStage:1] 2014-03-12 20:25:41,090 Gossiper.java (line 809) InetAddress /192.168.1.10 is now UP
>  INFO [GossipTasks:1] 2014-03-12 20:34:54,238 Gossiper.java (line 823) InetAddress /192.168.1.10 is now DOWN
> ERROR [GossipTasks:1] 2014-03-12 20:34:54,240 AbstractStreamSession.java (line 110) Stream failed because /192.168.1.10 died or was restarted/removed (streams may still be active in background, but further streams won't be started)
>  WARN [GossipTasks:1] 2014-03-12 20:34:54,240 RangeStreamer.java (line 246) Streaming from /192.168.1.10 failed
> ERROR [GossipTasks:1] 2014-03-12 20:34:54,240 AbstractStreamSession.java (line 110) Stream failed because /192.168.1.10 died or was restarted/removed (streams may still be active in background, but further streams won't be started)
>  WARN [GossipTasks:1] 2014-03-12 20:34:54,241 RangeStreamer.java (line 246) Streaming from /192.168.1.10 failed
>
> [1]
> http://www.datastax.com/docs/1.1/cluster_management#replacing-a-dead-node
>
> Cheers,
>
> Paulo
>
> --
> *Paulo Motta*
>
> Chaordic | *Platform*
> *www.chaordic.com.br <http://www.chaordic.com.br/>*
> +55 48 3232.3200
> +55 83 9690-1314
>



-- 
*Paulo Motta*

Chaordic | *Platform*
*www.chaordic.com.br <http://www.chaordic.com.br/>*
+55 48 3232.3200
+55 83 9690-1314
