cassandra-commits mailing list archives

From "Sam Tunnicliffe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10134) Always require replace_address to replace existing address
Date Fri, 15 Apr 2016 15:19:25 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243094#comment-15243094
] 

Sam Tunnicliffe commented on CASSANDRA-10134:
---------------------------------------------

I should also mention that this ticket uncovered a limitation in ccmlib as it's used
by dtests. Currently, there's no way (that I could find at least) to specify a seed address
directly, the cluster's seed list being a list of {{Node}} instances. When generating node
config, {{Cluster}} then always uses the storage (i.e. listen) address in the seed list. This
is a problem when {{broadcast_address != listen_address}}, as gossip is broadcast-centric.
I found that {{snitch_test}} would fail because it specifies a single seed and that node would
get stuck in the shadow round (SR), waiting for a response from its own {{listen_address}}. The correct behaviour
would be to recognise that the only entry in the seed list was itself and skip the shadow
round completely. I've pushed a change to ccm [here|https://github.com/pcmanus/ccm/compare/master...beobal:10134]
for this and dtest runs should use that branch.
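
To illustrate the seed-address problem, here is a minimal Python sketch. It is not the ccm change linked above and does not use ccmlib's real classes: {{NodeAddresses}}, {{seed_addresses}} and {{should_skip_shadow_round}} are hypothetical names invented for this example. It assumes a node identifies itself in gossip by its broadcast address, falling back to the listen address when none is set.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class NodeAddresses:
    """Hypothetical stand-in for a node's network settings (not ccmlib's Node)."""
    listen_address: str
    broadcast_address: Optional[str] = None

    def gossip_address(self) -> str:
        # Assumption: gossip identifies a node by its broadcast address and
        # falls back to the listen address when no broadcast address is set.
        return self.broadcast_address or self.listen_address


def seed_addresses(seeds: List[NodeAddresses]) -> List[str]:
    # Build the seed list from gossip (broadcast) addresses instead of
    # always using the listen address.
    return [seed.gossip_address() for seed in seeds]


def should_skip_shadow_round(local: NodeAddresses, seeds: List[str]) -> bool:
    # True when the seed list contains no address other than the local
    # node's own gossip address, so there is no peer to wait for.
    return set(seeds) <= {local.gossip_address()}
```

With a node whose listen address is 127.0.0.1 and whose broadcast address is 10.0.0.1, a seed list built by {{seed_addresses}} lets the node recognise itself as the only seed, whereas a seed list containing 127.0.0.1 does not, which mirrors the hang seen in {{snitch_test}}.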


> Always require replace_address to replace existing address
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-10134
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10134
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Distributed Metadata
>            Reporter: Tyler Hobbs
>            Assignee: Sam Tunnicliffe
>             Fix For: 3.x
>
>
> Normally, when a node is started from a clean state with the same address as an existing down node, it will fail to start with an error like this:
> {noformat}
> ERROR [main] 2015-08-19 15:07:51,577 CassandraDaemon.java:554 - Exception encountered during startup
> java.lang.RuntimeException: A node with address /127.0.0.3 already exists, cancelling join. Use cassandra.replace_address if you want to replace this node.
> 	at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:543) ~[main/:na]
> 	at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:783) ~[main/:na]
> 	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:720) ~[main/:na]
> 	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:611) ~[main/:na]
> 	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:378) [main/:na]
> 	at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:537) [main/:na]
> 	at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:626) [main/:na]
> {noformat}
> However, if {{auto_bootstrap}} is set to false or the node is in its own seed list, it will not throw this error and will start normally.  The new node then takes over the host ID of the old node (even if the tokens are different), and the only message you will see is a warning in the other nodes' logs:
> {noformat}
> logger.warn("Changing {}'s host ID from {} to {}", endpoint, storedId, hostId);
> {noformat}
> This could cause an operator to accidentally wipe out the token information for a down node without replacing it.  To fix this, we should check for an endpoint collision even if {{auto_bootstrap}} is false or the node is a seed.
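
The proposed behaviour in the description can be sketched in Python as a simplified model. This is not Cassandra's Java implementation: {{EndpointCollisionError}} and this {{check_for_endpoint_collision}} function are hypothetical, and the sketch assumes a collision is tolerated only when the replace address names the colliding address. The point is that the check takes no {{auto_bootstrap}} or seed flag, so neither setting can bypass it.

```python
from typing import Optional, Set


class EndpointCollisionError(RuntimeError):
    """Hypothetical error type used only in this sketch."""


def check_for_endpoint_collision(address: str,
                                 known_endpoints: Set[str],
                                 replace_address: Optional[str] = None) -> None:
    # Deliberately no auto_bootstrap or is_seed parameter: the check runs
    # the same way however the node was configured to start.
    if replace_address == address:
        # Assumption: the operator explicitly asked to replace this address.
        return
    if address in known_endpoints:
        raise EndpointCollisionError(
            f"A node with address /{address} already exists, cancelling join. "
            "Use cassandra.replace_address if you want to replace this node."
        )
```

Under this model, starting a clean node at 127.0.0.3 while 127.0.0.3 is still known to the cluster fails unless the replace address is given explicitly.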



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
