cassandra-commits mailing list archives

From "David Capwell (Jira)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-16213) Cannot replace_address /X because it doesn't exist in gossip
Date Mon, 26 Oct 2020 18:11:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-16213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17220881#comment-17220881 ]

David Capwell commented on CASSANDRA-16213:
-------------------------------------------

Thanks for the review [~paulo]

bq.  I would just limit propagating state about the downed host only during shadow gossip
response

I will test this out

bq. I'm not very familiar with in-jvm dtest infrastructure so it would be nice to get a pair
of eyes on that to make sure the framework changes look good.

Agreed.  Jvm-dtest forked a lot of CassandraDaemon, so it took a while to make sure the logic
matched; this was a good chunk of this patch =(.

> Cannot replace_address /X because it doesn't exist in gossip
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-16213
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16213
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Gossip, Cluster/Membership
>            Reporter: David Capwell
>            Assignee: David Capwell
>            Priority: Normal
>             Fix For: 4.0-beta
>
>
> We see this exception when nodes crash and then try to do a host replacement; the
> error appears to be correlated with multiple node failures.
> A simplified case to trigger this is the following
> *) Have an N-node cluster
> *) Shut down all N nodes
> *) Bring up N-1 nodes (at least 1 seed, else replace seed)
> *) Host replace the N-1th node -> this will fail with the above
> The reason this happens is that the N-1th node isn’t gossiping anymore, and the existing
> nodes do not have its details in gossip (but have the details in the peers table), so the
> host replacement fails as the node isn’t known in gossip.
> This affects all versions (tested 3.0 and trunk, assume 2.2 as well)
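
The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToyNode, gossip_round, can_replace, and simulate are hypothetical names invented for illustration and are not Cassandra classes or APIs. It assumes gossip state is held in memory and lost on restart, while the peers table is persisted.

```python
# Toy model only: none of these names are Cassandra classes or APIs.

class ToyNode:
    def __init__(self, address):
        self.address = address
        self.gossip_state = {}    # assumed in-memory; lost on restart
        self.peers_table = set()  # assumed persisted; survives restart

    def learn(self, other_address):
        # Record a peer both in gossip and in the persisted peers table.
        self.gossip_state[other_address] = "NORMAL"
        self.peers_table.add(other_address)

    def restart(self):
        # A restart clears gossip state but keeps the peers table.
        self.gossip_state = {}


def gossip_round(live_nodes):
    # Every live node learns about every other live node.
    for node in live_nodes:
        for other in live_nodes:
            if other is not node:
                node.learn(other.address)


def can_replace(target_address, live_nodes):
    # Replacement is only allowed if some live node knows the target in gossip.
    return any(target_address in n.gossip_state for n in live_nodes)


def simulate(n):
    nodes = [ToyNode(f"10.0.0.{i + 1}") for i in range(n)]
    gossip_round(nodes)            # healthy N-node cluster
    for node in nodes:
        node.restart()             # shut down all N nodes
    live, down = nodes[:-1], nodes[-1]
    gossip_round(live)             # bring up only N-1 nodes
    in_peers = all(down.address in x.peers_table for x in live)
    return in_peers, can_replace(down.address, live)
```

In this model, simulate(3) reports that the downed node is still present in every live node's peers table but that the replacement check fails, because no live node has relearned it through gossip.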



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org

