cassandra-user mailing list archives

From "Vincent Rischmann" <vinc...@rischmann.fr>
Subject Re: gossipinfo contains two nodes dead for more than two years
Date Wed, 28 Aug 2019 16:10:06 GMT
Yep, they're not visible in either ring or status.

On Wed, Aug 28, 2019, at 17:08, Jeff Jirsa wrote:
> Based on what you've posted, I assume the instances are not visible in `nodetool ring` or `nodetool status`, and the only reason you know they're still in gossipinfo is you see them in the logs? If that's the case, then yes, I would do `nodetool assassinate`.
> 
> 
> 
> On Wed, Aug 28, 2019 at 7:33 AM Vincent Rischmann <vincent@rischmann.fr> wrote:
>> Hi,
>> 
>> while replacing a node in a cluster I saw this log:
>> 
>>  2019-08-27 16:35:31,439 Gossiper.java:995 - InetAddress /10.15.53.27 is now DOWN
>> 
>> it caught my attention because that ip address doesn't exist anymore in the cluster and it hasn't for a long time.
>> 
>> After some reading I ran `nodetool gossipinfo` and I saw these entries which are nodes that don't exist anymore:
>> 
>>  /10.15.53.27
>>  generation:1503480618
>>  heartbeat:26970
>>  STATUS:2:hibernate,true
>>  LOAD:26810:6.17363354147E11
>>  SCHEMA:101:d21b1e47-f226-3417-8de7-5802518ae824
>>  DC:10:DC1
>>  RACK:12:RAC1
>>  RELEASE_VERSION:6:2.1.18
>>  INTERNAL_IP:8:10.15.53.27
>>  RPC_ADDRESS:5:10.15.53.27
>>  SEVERITY:26972:0.0
>>  NET_VERSION:3:8
>>  HOST_ID:4:2488fccc-108a-4a9d-ad43-5e8b8b6ee17b
>>  TOKENS:1:<hidden>
>>  /10.5.1.16
>>  generation:1503636779
>>  heartbeat:324
>>  STATUS:2:hibernate,true
>>  LOAD:204:2.601990697532E12
>>  SCHEMA:14:d21b1e47-f226-3417-8de7-5802518ae824
>>  DC:10:DC1
>>  RACK:12:RAC1
>>  RELEASE_VERSION:6:2.1.18
>>  INTERNAL_IP:8:10.5.1.16
>>  RPC_ADDRESS:5:10.5.1.16
>>  SEVERITY:326:0.0
>>  NET_VERSION:3:8
>>  HOST_ID:4:2488fccc-108a-4a9d-ad43-5e8b8b6ee17b
>>  TOKENS:1:<hidden>
>> 
>> the generations are:
>> 
>> - Wed, 23 Aug 2017 09:30:18 GMT
>> - Fri, 25 Aug 2017 04:52:59 GMT
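For reference, the two dates above can be reproduced with a short Python sketch. It assumes the gossip generation is a Unix timestamp in seconds; `generation_to_utc` is a hypothetical helper name, not part of any Cassandra tooling.

```python
from datetime import datetime, timezone


def generation_to_utc(generation: int) -> str:
    """Format a gossip generation (assumed to be Unix epoch seconds) as a UTC date."""
    moment = datetime.fromtimestamp(generation, tz=timezone.utc)
    return moment.strftime("%a, %d %b %Y %H:%M:%S GMT")


# The two generation values from the gossipinfo output quoted above.
for generation in (1503480618, 1503636779):
    print(generation, "->", generation_to_utc(generation))
```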
>> 
>> I don't remember what we did at that time, but it looks like we botched something while joining a node.
>> 
>> After reading https://thelastpickle.com/blog/2018/09/18/assassinate.html I'm thinking of doing the following:
>> 
>> * nodetool removenode 10.15.53.27
>> * if it doesn't work for some reason: nodetool assassinate 10.15.53.27
>> 
>> Since those nodes have been long dead and don't appear in system.peers, I don't anticipate any problems, but I'd like some confirmation that this can't break my cluster.
>> 
>> Thanks !
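
A rough way to spot leftovers like these is to parse saved `nodetool gossipinfo` output and list endpoints whose STATUS is not NORMAL. The Python sketch below assumes the output format shown in the quoted message (an `/ip` header line followed by `KEY:version:value` lines). `parse_gossipinfo` and `suspect_endpoints` are hypothetical helper names, the 192.0.2.10 entry is invented for contrast, and a non-NORMAL status is only a hint, not proof that a node is gone. The code only reads text and never talks to a cluster.

```python
from datetime import datetime, timezone

# Abbreviated entries copied from the gossipinfo output quoted above.
QUOTED_ENTRIES = """\
/10.15.53.27
generation:1503480618
heartbeat:26970
STATUS:2:hibernate,true
RELEASE_VERSION:6:2.1.18
/10.5.1.16
generation:1503636779
heartbeat:324
STATUS:2:hibernate,true
RELEASE_VERSION:6:2.1.18
"""

# Hypothetical live node, invented here only for contrast (not from the thread).
HYPOTHETICAL_LIVE_ENTRY = """\
/192.0.2.10
generation:1566900000
heartbeat:1234
STATUS:20:NORMAL,-1
"""


def parse_gossipinfo(text: str) -> dict:
    """Group gossipinfo lines by endpoint; a line starting with '/' opens a new one."""
    endpoints = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("/"):
            current = endpoints.setdefault(line.lstrip("/"), {})
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            if key.isupper():
                # Application states look like KEY:version:value; drop the version.
                value = value.partition(":")[2]
            current[key] = value
    return endpoints


def suspect_endpoints(endpoints: dict) -> list:
    """Return (address, status, start date) for endpoints whose STATUS is not NORMAL."""
    suspects = []
    for address, states in endpoints.items():
        status = states.get("STATUS", "")
        if status.startswith("NORMAL") or "generation" not in states:
            continue
        started = datetime.fromtimestamp(int(states["generation"]), tz=timezone.utc)
        suspects.append((address, status, started.date().isoformat()))
    return suspects


endpoints = parse_gossipinfo(QUOTED_ENTRIES + HYPOTHETICAL_LIVE_ENTRY)
for address, status, started in suspect_endpoints(endpoints):
    print(f"{address}: STATUS={status}, generation date {started}")
```

Anything this flags would still need to be checked against `nodetool status` and `nodetool ring` before removing it, as discussed in the thread.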