incubator-cassandra-user mailing list archives

From Alain RODRIGUEZ <arodr...@gmail.com>
Subject Re: Unreachable node, not in nodetool ring
Date Thu, 19 Jul 2012 09:37:34 GMT
Not sure if this helps, but here is the gossipinfo output:

nodetool -h localhost gossipinfo
/10.58.83.109
  RELEASE_VERSION:1.1.2
  RACK:1b
  LOAD:5.9384978406E10
  SCHEMA:e7e0ec6c-616e-32e7-ae29-40eae2b82ca8
  DC:eu-west
  STATUS:NORMAL,85070591730234615865843651857942052864
  RPC_ADDRESS:0.0.0.0
/10.248.10.94
  RELEASE_VERSION:1.1.2
  LOAD:3.0128207422E10
  SCHEMA:e7e0ec6c-616e-32e7-ae29-40eae2b82ca8
  STATUS:LEFT,0,1342866804032
  RPC_ADDRESS:0.0.0.0
/10.56.62.211
  RELEASE_VERSION:1.1.2
  LOAD:11594.0
  RACK:1b
  SCHEMA:59adb24e-f3cd-3e02-97f0-5b395827453f
  DC:eu-west
  REMOVAL_COORDINATOR:REMOVER,85070591730234615865843651857942052864
  STATUS:removed,170141183460469231731687303715884105727,1342453967415
  RPC_ADDRESS:0.0.0.0
/10.59.21.241
  RELEASE_VERSION:1.1.2
  RACK:1b
  LOAD:1.08667047094E11
  SCHEMA:e7e0ec6c-616e-32e7-ae29-40eae2b82ca8
  DC:eu-west
  STATUS:NORMAL,0
  RPC_ADDRESS:0.0.0.0
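
To spot the leftover entries quickly, the output above can be grouped by endpoint and filtered on STATUS. This is only a minimal sketch: it assumes the text layout shown above (an endpoint line starting with "/" followed by KEY:value lines), and the helper names parse_gossipinfo and non_normal are made up for illustration; they are not part of nodetool. The sample is abbreviated from the output above.

```python
# Minimal sketch: parse `nodetool gossipinfo` text and list endpoints
# whose STATUS is not NORMAL. Helper names are hypothetical.

SAMPLE = """\
/10.58.83.109
  RELEASE_VERSION:1.1.2
  STATUS:NORMAL,85070591730234615865843651857942052864
/10.248.10.94
  RELEASE_VERSION:1.1.2
  STATUS:LEFT,0,1342866804032
/10.56.62.211
  RELEASE_VERSION:1.1.2
  STATUS:removed,170141183460469231731687303715884105727,1342453967415
/10.59.21.241
  RELEASE_VERSION:1.1.2
  STATUS:NORMAL,0
"""


def parse_gossipinfo(text):
    """Return {endpoint_ip: {KEY: value}} from gossipinfo-style text."""
    nodes = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("/"):
            # Assumed layout: an endpoint line looks like "/10.0.0.1".
            current = line[1:]
            nodes[current] = {}
        elif current is not None and ":" in line:
            # Split on the first colon only; values may contain commas.
            key, value = line.split(":", 1)
            nodes[current][key] = value
    return nodes


def non_normal(nodes):
    """Return {endpoint_ip: status} for endpoints not in NORMAL state."""
    return {
        ip: state.get("STATUS", "")
        for ip, state in nodes.items()
        if not state.get("STATUS", "").startswith("NORMAL")
    }


for ip, status in non_normal(parse_gossipinfo(SAMPLE)).items():
    print(ip, status)
```

On the sample this reports 10.248.10.94 (LEFT) and 10.56.62.211 (removed), the two endpoints that gossip still remembers although they are no longer in the ring.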

Story :

I had a 2-node cluster:

10.248.10.94 Token 0
10.59.21.241 Token 85070591730234615865843651857942052864

I had to replace node 10.248.10.94, so I added 10.56.62.211 on token
0 - 1 (170141183460469231731687303715884105727). This failed, so I
removed the token.
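
For reference, the token values in this story can be checked with a bit of arithmetic. This is a sketch under one assumption: that RandomPartitioner tokens can be treated as positions on a ring of size 2**127, so "0 - 1" wraps around to 2**127 - 1. The function names are hypothetical and only serve the illustration.

```python
# Sketch of the token arithmetic, assuming a ring of size 2**127
# (illustrative treatment of the RandomPartitioner token space).

RING_SIZE = 2 ** 127


def token_before(token):
    """Token immediately preceding `token`, wrapping around the ring."""
    return (token - 1) % RING_SIZE


def balanced_tokens(node_count):
    """Evenly spaced initial tokens for `node_count` nodes."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


print(token_before(0))
print(balanced_tokens(2))
```

With these assumptions, token_before(0) gives the token used for 10.56.62.211 and balanced_tokens(2) gives the two tokens of the original cluster.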

I repeated the previous operation with the node 10.59.21.241 and it
went fine. Next I decommissioned the node 10.248.10.94 and moved
10.59.21.241 to token 0.

Now I am in the situation described before.

Alain


2012/7/19 Alain RODRIGUEZ <arodrime@gmail.com>:
> Hi, I wasn't able to see the token currently used by 10.56.62.211
> (the ghost node).
>
> I already removed the token 6 days ago :
>
> -> "Removing token 170141183460469231731687303715884105727 for /10.56.62.211"
>
> "- check in cassandra log. It is possible you see a log line telling
> you 10.56.62.211 and 10.59.21.241 or 10.58.83.109 share the same
> token"
>
> Nothing like that in the logs
>
> I tried the following without success :
>
> $ nodetool -h localhost removetoken 170141183460469231731687303715884105727
> Exception in thread "main" java.lang.UnsupportedOperationException:
> Token not found.
> ...
>
> I really thought this was going to work :-).
>
> Any other ideas ?
>
> Alain
>
> PS: I heard that Octo is a nice company, and you use Cassandra, so I
> guess you're fine there :-). I wish you the best, and thanks for your
> help.
>
> 2012/7/19 Olivier Mallassi <omallassi@octo.com>:
>> I got that a couple of times (due to DNS issues in our infra)
>>
>> what you could try
>> - check in cassandra log. It is possible you see a log line telling you
>> 10.56.62.211 and 10.59.21.241 or 10.58.83.109 share the same token
>> - if 10.56.62.211 is up, try decommission (via nodetool)
>> - if not, move 10.59.21.241 or 10.58.83.109 to current token + 1
>> - use removetoken (via nodetool) to remove the token associated with
>> 10.56.62.211. in case of failure, you can use removetoken -f instead.
>>
>> then, the unreachable IP should have disappeared.
>>
>>
>> HTH
>>
>> On Thu, Jul 19, 2012 at 10:38 AM, Alain RODRIGUEZ <arodrime@gmail.com>
>> wrote:
>>>
>>> Hi,
>>>
>>> I tried to add a node a few days ago and it failed. I finally made it
>>> work with another node, but now when I describe the cluster in the
>>> CLI I get this:
>>>
>>> Cluster Information:
>>>    Snitch: org.apache.cassandra.locator.Ec2Snitch
>>>    Partitioner: org.apache.cassandra.dht.RandomPartitioner
>>>    Schema versions:
>>>       UNREACHABLE: [10.56.62.211]
>>>       e7e0ec6c-616e-32e7-ae29-40eae2b82ca8: [10.59.21.241, 10.58.83.109]
>>>
>>> And nodetool ring gives me :
>>>
>>> Address         DC          Rack        Status State   Load        Owns     Token
>>>                                                                             85070591730234615865843651857942052864
>>> 10.59.21.241    eu-west     1b          Up     Normal  101.17 GB   50.00%   0
>>> 10.58.83.109    eu-west     1b          Up     Normal  55.27 GB    50.00%   85070591730234615865843651857942052864
>>>
>>> The point, as you can see, is that one of my nodes holds about twice
>>> as much data as the other one. I have RF = 2 defined.
>>>
>>> My guess is that the token 0 node keeps data for the unreachable node.
>>>
>>> The IP of the unreachable node doesn't belong to me anymore, I have no
>>> access to this ghost node.
>>>
>>> Does someone know how to completely remove this ghost node from my cluster
>>> ?
>>>
>>> Thank you.
>>>
>>> Alain
>>>
>>> INFO :
>>>
>>> On ubuntu (AMI Datastax 2.1 and 2.2)
>>> Cassandra 1.1.2 (upgraded from 1.0.9)
>>> 2 node cluster (+ the ghost one)
>>> RF = 2
>>
>>
>>
>>
>> --
>> ............................................................
>> Olivier Mallassi
>> OCTO Technology
>> ............................................................
>> 50, Avenue des Champs-Elysées
>> 75008 Paris
>>
>> Mobile: (33) 6 28 70 26 61
>> Tél: (33) 1 58 56 10 00
>> Fax: (33) 1 58 56 10 01
>>
>> http://www.octo.com
>> Octo Talks! http://blog.octo.com
>>
>>
