cassandra-user mailing list archives

From Colin Kuo <colinkuo...@gmail.com>
Subject Re: decommissioning a cassandra node
Date Mon, 27 Oct 2014 15:17:32 GMT
Hi Tim,

The node with the IP ending in .94 (162.243.109.94) is leaving. Something
may have gone wrong while it was streaming data. You can run "nodetool
netstats" on both nodes to check whether a streaming connection is stuck.
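
To spot such a node quickly, a rough sketch along these lines may help. It reads "nodetool status" output and prints every node whose code is not UN. The function name non_normal_nodes is made up for illustration and is not part of nodetool; the parsing assumes the status layout shown later in this thread.

```shell
# Hypothetical helper (not a nodetool feature): read `nodetool status`
# output on stdin and print "address code" for each node that is not UN.
non_normal_nodes() {
  # Node rows start with a two-letter code: U/D followed by N/L/J/M.
  awk '$1 ~ /^[UD][NLJM]$/ && $1 != "UN" { print $2, $1 }'
}
```

It would be used as "nodetool status | non_normal_nodes". Running "nodetool netstats" against each node it reports is then the next step.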

You can also force-remove the leaving node by shutting it down and then
running "nodetool removenode" to remove the dead node. Be aware that you
risk losing data if the RF in your cluster is lower than 3 and the data has
not been fully synced. So remember to sync the data with a repair before you
remove or decommission a node in the cluster.
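
The order of operations could be sketched like this. It is only an outline under some assumptions: the NODETOOL and DRY_RUN variables and the function names are invented for illustration, and the host and Host ID arguments are placeholders you would fill in from "nodetool status".

```shell
# Sketch only: NODETOOL and DRY_RUN are illustrative variables,
# not features of nodetool itself.
NODETOOL="${NODETOOL:-nodetool}"

run() {
  # With DRY_RUN=1, print the command instead of executing it.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Normal path: repair first, then decommission. Both act on the node
# that nodetool connects to, so -h must point at the node being retired.
graceful_leave() {
  leaving_host="$1"
  run "$NODETOOL" -h "$leaving_host" repair &&
  run "$NODETOOL" -h "$leaving_host" decommission
}

# Fallback: the leaving node has already been shut down. Run removenode
# against a live node and pass the dead node's Host ID.
force_remove() {
  live_host="$1"
  dead_host_id="$2"
  run "$NODETOOL" -h "$live_host" removenode "$dead_host_id"
}
```

Running with DRY_RUN=1 first only prints the commands, so you can check that -h points at the node you intend before anything changes in the ring.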

Thanks!

On Mon, Oct 27, 2014 at 9:55 PM, Tim Dunphy <bluethundr@gmail.com> wrote:

>> "Also, is there any document that explains what all the nodetool
>> abbreviations (UN, UL) stand for?"
>>
>> --> The documentation is in the command output itself
>>
>> Datacenter: datacenter1
>> =======================
>> *Status=Up/Down*
>> *|/ State=Normal/Leaving/Joining/Moving*
>> --  Address         Load       Tokens  Owns    Host ID                               Rack
>> UN  162.243.86.41   1.08 MB    1       0.1%    e945f3b5-2e3e-4a20-b1bd-e30c474a7634  rack1
>> UL  162.243.109.94  1.28 MB    256     99.9%   fd2f76ae-8dcf-4e93-a37f-bf1e9088696e  rack1
>>
>> U = Up, D = Down
>> N = Normal, L = Leaving, J = Joining and M = Moving
>
>
> Ok, got it, thanks!
>
> Can someone suggest a good way to fix a node that is in a UL state?
>
> Thanks
> Tim
>
> On Mon, Oct 27, 2014 at 9:46 AM, DuyHai Doan <doanduyhai@gmail.com> wrote:
>
>> "Also, is there any document that explains what all the nodetool
>> abbreviations (UN, UL) stand for?"
>>
>> --> The documentation is in the command output itself
>>
>> Datacenter: datacenter1
>> =======================
>> *Status=Up/Down*
>> *|/ State=Normal/Leaving/Joining/Moving*
>> --  Address         Load       Tokens  Owns    Host ID                               Rack
>> UN  162.243.86.41   1.08 MB    1       0.1%    e945f3b5-2e3e-4a20-b1bd-e30c474a7634  rack1
>> UL  162.243.109.94  1.28 MB    256     99.9%   fd2f76ae-8dcf-4e93-a37f-bf1e9088696e  rack1
>>
>> U = Up, D = Down
>> N = Normal, L = Leaving, J = Joining and M = Moving
>>
>> On Mon, Oct 27, 2014 at 2:42 PM, Tim Dunphy <bluethundr@gmail.com> wrote:
>>
>>>> As I see the state 162.243.109.94 is UL(Up/Leaving) so maybe this is
>>>> causing the problem
>>>
>>>
>>> OK, that's an interesting observation. How do you fix a node that is in
>>> a UL state? What causes this?
>>>
>>> Also, is there any document that explains what all the nodetool
>>> abbreviations (UN, UL) stand for?
>>>
>>> On Mon, Oct 27, 2014 at 5:46 AM, jivko donev <jivko_d88@yahoo.com>
>>> wrote:
>>>
>>>> As I see the state 162.243.109.94 is UL(Up/Leaving) so maybe this is
>>>> causing the problem.
>>>>
>>>>
>>>>   On Sunday, October 26, 2014 11:57 PM, Tim Dunphy <
>>>> bluethundr@gmail.com> wrote:
>>>>
>>>>
>>>> Hey all,
>>>>
>>>>  I'm trying to decommission a node.
>>>>
>>>>  First I'm getting a status:
>>>>
>>>> [root@beta-new:/usr/local] #nodetool status
>>>> Note: Ownership information does not include topology; for complete
>>>> information, specify a keyspace
>>>> Datacenter: datacenter1
>>>> =======================
>>>> Status=Up/Down
>>>> |/ State=Normal/Leaving/Joining/Moving
>>>> --  Address         Load       Tokens  Owns    Host ID                               Rack
>>>> UN  162.243.86.41   1.08 MB    1       0.1%    e945f3b5-2e3e-4a20-b1bd-e30c474a7634  rack1
>>>> UL  162.243.109.94  1.28 MB    256     99.9%   fd2f76ae-8dcf-4e93-a37f-bf1e9088696e  rack1
>>>>
>>>>
>>>> But when I try to decommission the node I get this message:
>>>>
>>>> [root@beta-new:/usr/local] #nodetool -h 162.243.86.41 decommission
>>>> nodetool: Failed to connect to '162.243.86.41:7199' -
>>>> NoSuchObjectException: 'no such object in table'.
>>>>
>>>> Yet I can telnet to that host on that port just fine:
>>>>
>>>> [root@beta-new:/usr/local] #telnet 162.243.86.41 7199
>>>> Trying 162.243.86.41...
>>>> Connected to 162.243.86.41.
>>>> Escape character is '^]'.
>>>>
>>>>
>>>> And I have verified that cassandra is running and accessible via cqlsh
>>>> on the other machine.
>>>>
>>>> What could be going wrong?
>>>>
>>>> Thanks
>>>> Tim
>>>>
>>>>
>>>> --
>>>> GPG me!!
>>>>
>>>> gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> GPG me!!
>>>
>>> gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
>>>
>>>
>>
>
>
> --
> GPG me!!
>
> gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
>
>
