incubator-cassandra-user mailing list archives

From Alain RODRIGUEZ <arodr...@gmail.com>
Subject Re: Ghost nodes
Date Tue, 05 Mar 2013 14:00:45 GMT
"try assassinate from JMX?
http://nartax.com/2012/09/assassinate-cassandra-node/"

I finally used this solution... It always solves ghost node problems :D.
Last time I had unreachable nodes while describing the cluster in the CLI
(as described in the link), and I used the JMX unsafeAssassinateEndpoint
operation. This time it was a bit different, since the schema was good and
in sync between all the nodes, but the same operation solved this new issue.
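For reference, the call can be scripted against a JMX client such as jmxterm. This is a sketch, not a tested recipe: the MBean name (org.apache.cassandra.net:type=Gossiper) is the one that exposes unsafeAssassinateEndpoint in the 1.1 line, while the host:port, the jar path, and the target IP below are placeholders to substitute with your own values.

```shell
# Sketch (assumptions: default JMX port 7199, jmxterm available locally).
# Write a jmxterm script that invokes unsafeAssassinateEndpoint on the
# Gossiper MBean for one ghost endpoint.
cat <<'EOF' > assassinate.jmx
open localhost:7199
bean org.apache.cassandra.net:type=Gossiper
run unsafeAssassinateEndpoint 10.64.167.32
close
EOF
# Then feed the script to jmxterm non-interactively, e.g.:
#   java -jar jmxterm-uber.jar -n < assassinate.jmx
```

As the name suggests, the operation is unsafe: it drops the endpoint from gossip without streaming its data anywhere, so it is a last resort for nodes that are already gone.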

Thanks for the answer, even if I was also hoping to understand what
happened rather than just "assassinate" the problem; at least my prod is OK now.

Alain


2013/3/5 Jason Wee <peichieh@gmail.com>

> try assassinate from JMX?
> http://nartax.com/2012/09/assassinate-cassandra-node/
>
> or try cassandra -Dcassandra.load_ring_state=false
> http://www.datastax.com/docs/1.0/references/cassandra#options
>
>
> On Tue, Mar 5, 2013 at 6:54 PM, Alain RODRIGUEZ <arodrime@gmail.com> wrote:
>
>> Any clue on this ?
>>
>>
>> 2013/2/25 Alain RODRIGUEZ <arodrime@gmail.com>
>>
>>> Hi,
>>>
>>> I am having issues after decommissioning 3 nodes, one by one, from my
>>> 1.1.6 C* cluster (RF=3):
>>>
>>> On the "c.164" node, which was added a week after removing the 3 nodes,
>>> with gossipinfo I have:
>>>
>>> /a.135
>>>   RPC_ADDRESS:0.0.0.0
>>>   STATUS:NORMAL,127605887595351923798765477786913079296
>>>   RELEASE_VERSION:1.1.6
>>>   RACK:1b
>>>   SCHEMA:49aee81e-7c46-31bd-8e4b-dfd07d74d94c
>>>   DC:eu-west
>>>   LOAD:3.40954135223E11
>>> /b.173
>>>   RPC_ADDRESS:0.0.0.0
>>>   STATUS:NORMAL,0
>>>   RELEASE_VERSION:1.1.6
>>>   RACK:1b
>>>   SCHEMA:49aee81e-7c46-31bd-8e4b-dfd07d74d94c
>>>   DC:eu-west
>>>   LOAD:3.32757832183E11
>>> /c.164
>>>   RPC_ADDRESS:0.0.0.0
>>>   STATUS:NORMAL,85070591730234615865843651857942052864
>>>   RELEASE_VERSION:1.1.6
>>>   RACK:1b
>>>   SCHEMA:49aee81e-7c46-31bd-8e4b-dfd07d74d94c
>>>   DC:eu-west
>>>   LOAD:2.93726484252E11
>>> /d.6
>>>   RPC_ADDRESS:0.0.0.0
>>>   STATUS:NORMAL,42535295865117307932921825928971026432
>>>   RELEASE_VERSION:1.1.6
>>>   RACK:1b
>>>   SCHEMA:49aee81e-7c46-31bd-8e4b-dfd07d74d94c
>>>   DC:eu-west
>>>   LOAD:2.85020693654E11
>>>
>>>
>>> On the 3 other nodes I see this:
>>>
>>>
>>> /a.135
>>>   RPC_ADDRESS:0.0.0.0
>>>   SCHEMA:49aee81e-7c46-31bd-8e4b-dfd07d74d94c
>>>   RELEASE_VERSION:1.1.6
>>>   RACK:1b
>>>   STATUS:NORMAL,127605887595351923798765477786913079296
>>>   DC:eu-west
>>>   LOAD:3.40974023487E11
>>> /10.64.167.32
>>>   RPC_ADDRESS:0.0.0.0
>>>   SCHEMA:d9adcce3-09ed-3e7f-a6a3-147d4283ed15
>>>   RELEASE_VERSION:1.1.6
>>>   RACK:1b
>>>   STATUS:LEFT,28356863910078203714492389662765613056,1359823927010
>>>   DC:eu-west
>>>   LOAD:1.47947624544E11
>>> /10.250.202.154
>>>   RPC_ADDRESS:0.0.0.0
>>>   SCHEMA:d9adcce3-09ed-3e7f-a6a3-147d4283ed15
>>>   RELEASE_VERSION:1.1.6
>>>   RACK:1b
>>>   STATUS:LEFT,85070591730234615865843651857942052863,1359808901882
>>>   DC:eu-west
>>>   LOAD:1.45049060742E11
>>> /b.173
>>>   RPC_ADDRESS:0.0.0.0
>>>   SCHEMA:49aee81e-7c46-31bd-8e4b-dfd07d74d94c
>>>   RELEASE_VERSION:1.1.6
>>>   RACK:1b
>>>   STATUS:NORMAL,0
>>>   DC:eu-west
>>>   LOAD:3.32760540235E11
>>> /c.164
>>>   RPC_ADDRESS:0.0.0.0
>>>   SCHEMA:49aee81e-7c46-31bd-8e4b-dfd07d74d94c
>>>   RELEASE_VERSION:1.1.6
>>>   RACK:1b
>>>   LOAD:2.93751485625E11
>>>   DC:eu-west
>>>   STATUS:NORMAL,85070591730234615865843651857942052864
>>> /10.64.103.228
>>>   RPC_ADDRESS:0.0.0.0
>>>   SCHEMA:d9adcce3-09ed-3e7f-a6a3-147d4283ed15
>>>   RELEASE_VERSION:1.1.6
>>>   RACK:1b
>>>   STATUS:LEFT,141784319550391032739561396922763706367,1359893766266
>>>   DC:eu-west
>>>   LOAD:2.46247802646E11
>>> /d.6
>>>   RPC_ADDRESS:0.0.0.0
>>>   SCHEMA:49aee81e-7c46-31bd-8e4b-dfd07d74d94c
>>>   RELEASE_VERSION:1.1.6
>>>   RACK:1b
>>>   STATUS:NORMAL,42535295865117307932921825928971026432
>>>   DC:eu-west
>>>   LOAD:2.85042986093E11
>>>
>>>
>>> Since I removed these 3 nodes (marked as "left") at least 3 weeks ago,
>>> shouldn't gossip have removed them completely by now?
>>>
>>> The "c.164" node, the one whose gossipinfo no longer shows the nodes
>>> that left the ring, is logging the following every minute:
>>>
>>> ...
>>>  INFO [GossipStage:1] 2013-02-25 10:18:56,269 Gossiper.java (line 830)
>>> InetAddress /10.64.167.32 is now dead.
>>>  INFO [GossipStage:1] 2013-02-25 10:18:56,283 Gossiper.java (line 830)
>>> InetAddress /10.250.202.154 is now dead.
>>>  INFO [GossipStage:1] 2013-02-25 10:18:56,297 Gossiper.java (line 830)
>>> InetAddress /10.64.103.228 is now dead.
>>>  INFO [GossipStage:1] 2013-02-25 10:19:57,700 Gossiper.java (line 830)
>>> InetAddress /10.64.167.32 is now dead.
>>>  INFO [GossipStage:1] 2013-02-25 10:19:57,721 Gossiper.java (line 830)
>>> InetAddress /10.250.202.154 is now dead.
>>>  INFO [GossipStage:1] 2013-02-25 10:19:57,742 Gossiper.java (line 830)
>>> InetAddress /10.64.103.228 is now dead.
>>>  INFO [GossipStage:1] 2013-02-25 10:20:58,722 Gossiper.java (line 830)
>>> InetAddress /10.64.167.32 is now dead.
>>>  INFO [GossipStage:1] 2013-02-25 10:20:58,739 Gossiper.java (line 830)
>>> InetAddress /10.250.202.154 is now dead.
>>>  INFO [GossipStage:1] 2013-02-25 10:20:58,754 Gossiper.java (line 830)
>>> InetAddress /10.64.103.228 is now dead.
>>> ...
>>>
>>> All this looks a bit weird to me. Is this normal?
>>>
>>> Alain
>>>
>>>
>>>
>>>
>>
>
