cassandra-commits mailing list archives

From "Brandon Williams (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8138) replace_address cannot find node to be replaced node after seed node restart
Date Thu, 06 Nov 2014 17:47:34 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14200541#comment-14200541 ]

Brandon Williams commented on CASSANDRA-8138:
---------------------------------------------

I think I'd much rather say that the edge case of a node dying, followed by a full cluster restart
(a rolling restart would still work), is just not supported, rather than make such invasive changes to
support replacement under such strange and rare conditions.  If that happens, it's time to
assassinate the node and bootstrap another one.

> replace_address cannot find node to be replaced node after seed node restart
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8138
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8138
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Oleg Anastasyev
>            Assignee: Brandon Williams
>         Attachments: ReplaceAfterSeedRestart.txt
>
>
> If a node failed and the cluster was restarted (a common case during massive outages), replace_address fails with
> {code}
> Caused by: java.lang.RuntimeException: Cannot replace_address /172.19.56.97 because it doesn't exist in gossip
> jvm 1    | 	at org.apache.cassandra.service.StorageService.prepareReplacementInfo(StorageService.java:472)
> jvm 1    | 	at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:724)
> jvm 1    | 	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:686)
> jvm 1    | 	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:562)
> {code}
> Although the necessary information is saved in system tables on seed nodes, it is not loaded into gossip on the seed nodes, so a replacement node cannot get this info.
> The attached patch loads all information from system tables into gossip with generation 0 and fixes some bugs around this info in the shadow gossip round.
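
For readers unfamiliar with the idea in the issue description, here is a minimal sketch of what "loading saved endpoint info into gossip with generation 0" could look like. It is not the attached patch and does not use Cassandra's real Gossiper or SystemKeyspace classes: every class, field, and method name below is hypothetical, and the reading that generation 0 is chosen so that any live state supersedes it is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model. Not Cassandra's actual gossip implementation.
final class SavedEndpointGossipSketch {

    /** Minimal stand-in for the gossip state kept per endpoint. */
    static final class EndpointState {
        final int generation;
        final String hostId;
        final String tokens;

        EndpointState(int generation, String hostId, String tokens) {
            this.generation = generation;
            this.hostId = hostId;
            this.tokens = tokens;
        }
    }

    /** Stand-in for a row persisted in a system table for a known peer. */
    static final class SavedEndpoint {
        final String hostId;
        final String tokens;

        SavedEndpoint(String hostId, String tokens) {
            this.hostId = hostId;
            this.tokens = tokens;
        }
    }

    private final Map<String, EndpointState> endpointStates = new HashMap<>();

    /** On startup, seed the gossip map from saved rows with generation 0, never overwriting existing state. */
    void loadSavedEndpoints(Map<String, SavedEndpoint> saved) {
        for (Map.Entry<String, SavedEndpoint> e : saved.entrySet()) {
            SavedEndpoint row = e.getValue();
            endpointStates.putIfAbsent(e.getKey(), new EndpointState(0, row.hostId, row.tokens));
        }
    }

    /** State received from a live node replaces stored state only if its generation is higher. */
    void applyLiveState(String endpoint, EndpointState incoming) {
        endpointStates.merge(endpoint, incoming,
                (current, candidate) -> candidate.generation > current.generation ? candidate : current);
    }

    /** The lookup a replacing node depends on; fails when the endpoint is unknown to gossip. */
    EndpointState prepareReplacementInfo(String endpoint) {
        EndpointState state = endpointStates.get(endpoint);
        if (state == null) {
            throw new RuntimeException(
                    "Cannot replace_address " + endpoint + " because it doesn't exist in gossip");
        }
        return state;
    }
}
```

In this model, a freshly restarted seed that skips loadSavedEndpoints has no entry for the dead node, so the lookup fails as in the stack trace above. Once the saved rows are loaded, the lookup succeeds, and any later state from a live node with a higher generation replaces the placeholder entry.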



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
