cassandra-commits mailing list archives

From "Richard Low (JIRA)" <>
Subject [jira] [Created] (CASSANDRA-7307) New nodes mark dead nodes as up for 10 minutes
Date Tue, 27 May 2014 22:00:02 GMT
Richard Low created CASSANDRA-7307:

             Summary: New nodes mark dead nodes as up for 10 minutes
                 Key: CASSANDRA-7307
             Project: Cassandra
          Issue Type: Bug
            Reporter: Richard Low

When replacing a node while other nodes are down, we see the down nodes marked as up
for about 10 minutes. This means requests are routed to the dead nodes, causing timeouts. It
also means that replacing a node when multiple nodes from a replica set are down is extremely
difficult: the new node usually tries to stream from a dead node and the replacement fails.

This isn't limited to host replacement. I did a simple test:

1. Create a 2 node cluster
2. Kill node 2
3. Start a 3rd node with a unique token (I used auto_bootstrap=false but I don't think this
is significant)

The 3rd node lists node 2 as up for almost 10 minutes:

INFO [main] 2014-05-27 14:28:24,753 (line 119) Logging initialized
INFO [GossipStage:1] 2014-05-27 14:28:31,492 (line 843) Node / is now part of the cluster
INFO [GossipStage:1] 2014-05-27 14:28:31,495 (line 809) InetAddress / is now UP
INFO [GossipTasks:1] 2014-05-27 14:37:44,526 (line 823) InetAddress / is now DOWN
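
To make the "almost 10 minutes" figure concrete, here is a minimal Python sketch that computes the gap between the "is now UP" and "is now DOWN" lines. The two timestamps are copied from the log above; the helper name seconds_between and the format string are illustrative assumptions, not part of Cassandra.

```python
from datetime import datetime

# Assumed timestamp layout of the log lines: date, then time with
# comma-separated milliseconds.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def seconds_between(start: str, end: str) -> float:
    """Return the number of seconds elapsed between two log timestamps."""
    started = datetime.strptime(start, LOG_TIME_FORMAT)
    ended = datetime.strptime(end, LOG_TIME_FORMAT)
    return (ended - started).total_seconds()


# Timestamps of the "is now UP" and "is now DOWN" lines in the log above.
marked_up = "2014-05-27 14:28:31,495"
marked_down = "2014-05-27 14:37:44,526"

elapsed = seconds_between(marked_up, marked_down)
minutes, seconds = divmod(elapsed, 60)
print(f"dead node reported as up for {int(minutes)} min {seconds:.3f} s")
```

With these two timestamps the gap works out to a little over 9 minutes (553.031 seconds), which matches the description.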

This repro is on 1.2.16.
