cassandra-user mailing list archives

From Alain RODRIGUEZ <arodr...@gmail.com>
Subject Re: Unreachable node, not in nodetool ring
Date Mon, 23 Jul 2012 08:11:03 GMT
Does anyone know how to completely remove a dead node that only appears
when running "describe cluster" from the CLI?

I still have this issue in my production cluster.

Alain

2012/7/20 Alain RODRIGUEZ <arodrime@gmail.com>:
> Hi Aaron,
>
> I have already run repair and cleanup on both nodes, and I did so after
> every change to my ring (it took me a while, btw :)).
>
> The node *.211 is actually out of the ring and out of my control
> 'cause I don't have the server anymore (EC2 instance terminated a few
> days ago).
>
> Alain
>
> 2012/7/20 aaron morton <aaron@thelastpickle.com>:
>> I would:
>>
>> * run repair on 10.58.83.109
>> * run cleanup on 10.59.21.241 (I assume this was the first node).
>>
>> It looks like 10.56.62.211 is out of the cluster.
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 19/07/2012, at 9:37 PM, Alain RODRIGUEZ wrote:
>>
>> Not sure if this may help :
>>
>> nodetool -h localhost gossipinfo
>> /10.58.83.109
>>  RELEASE_VERSION:1.1.2
>>  RACK:1b
>>  LOAD:5.9384978406E10
>>  SCHEMA:e7e0ec6c-616e-32e7-ae29-40eae2b82ca8
>>  DC:eu-west
>>  STATUS:NORMAL,85070591730234615865843651857942052864
>>  RPC_ADDRESS:0.0.0.0
>> /10.248.10.94
>>  RELEASE_VERSION:1.1.2
>>  LOAD:3.0128207422E10
>>  SCHEMA:e7e0ec6c-616e-32e7-ae29-40eae2b82ca8
>>  STATUS:LEFT,0,1342866804032
>>  RPC_ADDRESS:0.0.0.0
>> /10.56.62.211
>>  RELEASE_VERSION:1.1.2
>>  LOAD:11594.0
>>  RACK:1b
>>  SCHEMA:59adb24e-f3cd-3e02-97f0-5b395827453f
>>  DC:eu-west
>>  REMOVAL_COORDINATOR:REMOVER,85070591730234615865843651857942052864
>>  STATUS:removed,170141183460469231731687303715884105727,1342453967415
>>  RPC_ADDRESS:0.0.0.0
>> /10.59.21.241
>>  RELEASE_VERSION:1.1.2
>>  RACK:1b
>>  LOAD:1.08667047094E11
>>  SCHEMA:e7e0ec6c-616e-32e7-ae29-40eae2b82ca8
>>  DC:eu-west
>>  STATUS:NORMAL,0
>>  RPC_ADDRESS:0.0.0.0
>>
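
A minimal sketch, not part of the original thread, of how the gossipinfo output above could be scanned for endpoints that gossip still knows about but that are not in the NORMAL state. The function names `parse_gossipinfo` and `non_normal_endpoints` are hypothetical, and the sample is an abridged copy of the output quoted above. It assumes the layout shown there: a line starting with `/` names an endpoint, followed by `KEY:value` lines.

```python
def parse_gossipinfo(text):
    """Parse `nodetool gossipinfo` text into {endpoint: {KEY: value}}."""
    nodes = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("/"):
            # A new endpoint section, e.g. "/10.56.62.211"
            current = line[1:]
            nodes[current] = {}
        elif current is not None and ":" in line:
            # Split on the first colon only; the value is kept verbatim
            key, _, value = line.partition(":")
            nodes[current][key] = value
    return nodes


def non_normal_endpoints(nodes):
    """Return {endpoint: state} for endpoints whose STATUS is not NORMAL."""
    result = {}
    for endpoint, fields in nodes.items():
        # STATUS looks like "NORMAL,<token>" or "removed,<token>,<timestamp>"
        state = fields.get("STATUS", "").split(",")[0]
        if state != "NORMAL":
            result[endpoint] = state
    return result


# Abridged copy of the gossipinfo output quoted in this thread
SAMPLE = """\
/10.58.83.109
 STATUS:NORMAL,85070591730234615865843651857942052864
 SCHEMA:e7e0ec6c-616e-32e7-ae29-40eae2b82ca8
/10.248.10.94
 STATUS:LEFT,0,1342866804032
/10.56.62.211
 STATUS:removed,170141183460469231731687303715884105727,1342453967415
 SCHEMA:59adb24e-f3cd-3e02-97f0-5b395827453f
/10.59.21.241
 STATUS:NORMAL,0
"""

if __name__ == "__main__":
    for endpoint, state in non_normal_endpoints(parse_gossipinfo(SAMPLE)).items():
        print(endpoint, state)
```

On this sample the sketch flags 10.248.10.94 (LEFT) and 10.56.62.211 (removed), the two endpoints that no longer appear in nodetool ring.
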
>> Story :
>>
>> I had a 2-node cluster:
>>
>> 10.248.10.94 Token 0
>> 10.59.21.241 Token 85070591730234615865843651857942052864
>>
>> I had to replace node 10.248.10.94, so I added 10.56.62.211 at token 0 - 1
>> (170141183460469231731687303715884105727). This failed, so I removed the
>> token.
>>
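
A short worked sketch, not part of the original thread, of the token arithmetic behind the values above. It assumes the commonly documented RandomPartitioner range of 0 to 2**127 - 1; the names `RING_SIZE`, `balanced_tokens` and `token_before` are hypothetical helpers for illustration.

```python
# Assumption: RandomPartitioner tokens run from 0 to 2**127 - 1
RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Evenly spaced initial tokens for a ring of node_count nodes."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def token_before(token):
    """The token just before `token`, wrapping around the ring."""
    return (token - 1) % RING_SIZE


if __name__ == "__main__":
    print(balanced_tokens(2))  # tokens for a balanced 2-node ring
    print(token_before(0))     # "token 0 - 1" wraps to the top of the ring
```

Under that assumption, a balanced 2-node ring uses tokens 0 and 85070591730234615865843651857942052864 (2**126), and "token 0 - 1" wraps around to 170141183460469231731687303715884105727 (2**127 - 1), matching the values in the story.
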
>> I repeated the previous operation with the node 10.59.21.241 and it went
>> fine. Next I decommissioned the node 10.248.10.94 and moved
>> 10.59.21.241 to token 0.
>>
>> Now I am in the situation described before.
>>
>> Alain
>>
>>
>> 2012/7/19 Alain RODRIGUEZ <arodrime@gmail.com>:
>>
>> Hi, I wasn't able to see the token currently used by 10.56.62.211
>> (the ghost node).
>>
>> I already removed the token 6 days ago:
>>
>> -> "Removing token 170141183460469231731687303715884105727 for
>> /10.56.62.211"
>>
>> "- check in cassandra log. It is possible you see a log line telling
>> you 10.56.62.211 and 10.59.21.241 or 10.58.83.109 share the same
>> token"
>>
>> Nothing like that in the logs.
>>
>> I tried the following without success:
>>
>> $ nodetool -h localhost removetoken 170141183460469231731687303715884105727
>> Exception in thread "main" java.lang.UnsupportedOperationException:
>> Token not found.
>> ...
>>
>> I really thought this was going to work :-).
>>
>> Any other ideas?
>>
>> Alain
>>
>> PS: I heard that Octo is a nice company and you use Cassandra, so I
>> guess you're fine in there :-). I wish you the best, and thanks for your
>> help.
>>
>> 2012/7/19 Olivier Mallassi <omallassi@octo.com>:
>>
>> I got that a couple of times (due to DNS issues in our infra).
>>
>> What you could try:
>> - check in cassandra log. It is possible you see a log line telling you
>>   10.56.62.211 and 10.59.21.241 or 10.58.83.109 share the same token
>> - if 10.56.62.211 is up, try decommission (via nodetool)
>> - if not, move 10.59.21.241 or 10.58.83.109 to current token + 1
>> - use removetoken (via nodetool) to remove the token associated with
>>   10.56.62.211. In case of failure, you can use removetoken -f instead.
>>
>> Then the unreachable IP should have disappeared.
>>
>> HTH
>>
>> On Thu, Jul 19, 2012 at 10:38 AM, Alain RODRIGUEZ <arodrime@gmail.com>
>> wrote:
>>
>> Hi,
>>
>> I tried to add a node a few days ago and it failed. I finally made it
>> work with another node, but now when I run "describe cluster" in the CLI
>> I get this:
>>
>> Cluster Information:
>>   Snitch: org.apache.cassandra.locator.Ec2Snitch
>>   Partitioner: org.apache.cassandra.dht.RandomPartitioner
>>   Schema versions:
>>      UNREACHABLE: [10.56.62.211]
>>      e7e0ec6c-616e-32e7-ae29-40eae2b82ca8: [10.59.21.241, 10.58.83.109]
>>
>> And nodetool ring gives me:
>>
>> Address         DC          Rack        Status State   Load            Owns                Token
>>                                                                                            85070591730234615865843651857942052864
>> 10.59.21.241    eu-west     1b          Up     Normal  101.17 GB       50.00%              0
>> 10.58.83.109    eu-west     1b          Up     Normal  55.27 GB        50.00%              85070591730234615865843651857942052864
>>
>> The point, as you can see, is that one of my nodes holds about twice as
>> much data as the other one. I have RF = 2 defined.
>>
>> My guess is that the token 0 node keeps data for the unreachable node.
>>
>> The IP of the unreachable node doesn't belong to me anymore; I have no
>> access to this ghost node.
>>
>> Does someone know how to completely remove this ghost node from my
>> cluster?
>>
>> Thank you.
>>
>> Alain
>>
>> INFO:
>>
>> On Ubuntu (AMI Datastax 2.1 and 2.2)
>> Cassandra 1.1.2 (upgraded from 1.0.9)
>> 2-node cluster (+ the ghost one)
>> RF = 2
>>
>> --
>> ............................................................
>> Olivier Mallassi
>> OCTO Technology
>> ............................................................
>> 50, Avenue des Champs-Elysées
>> 75008 Paris
>>
>> Mobile: (33) 6 28 70 26 61
>> Tél: (33) 1 58 56 10 00
>> Fax: (33) 1 58 56 10 01
>>
>> http://www.octo.com
>> Octo Talks! http://blog.octo.com
>>
