cassandra-commits mailing list archives

From "nayden kolev (JIRA)" <>
Subject [jira] [Created] (CASSANDRA-7825) node decommission leaves ghost nodes in system.peers table and JMX
Date Mon, 25 Aug 2014 17:12:58 GMT
nayden kolev created CASSANDRA-7825:

             Summary: node decommission leaves ghost nodes in system.peers table and JMX
                 Key: CASSANDRA-7825
             Project: Cassandra
          Issue Type: Bug
         Environment: OS: Ubuntu 12.04.4 LTS
Cassandra: ReleaseVersion:
DSE 4.5.1
OpsCenter: 5.0.0

            Reporter: nayden kolev

I have a 4-node cluster (split in 2 DCs) running DSE 4.5.1, C* I needed to cycle
a node (add a new node and remove one). I followed this doc (more specifically steps 1 and

After the decom, the decommissioned node logged this:

INFO [RMI TCP Connection(17)-] 2014-08-23 09:57:08,243 (line
141) Stop listening to thrift clients
INFO [RMI TCP Connection(17)-] 2014-08-23 09:57:08,269 (line 182) Stop
listening for CQL clients
INFO [RMI TCP Connection(17)-] 2014-08-23 09:57:08,270 (line 1279)
Announcing shutdown
INFO [RMI TCP Connection(17)-] 2014-08-23 09:57:10,271 (line
683) Waiting for messaging service to quiesce
INFO [ACCEPT-/] 2014-08-23 09:57:10,272 (line 923) MessagingService
has terminated the accept() thread
INFO [RMI TCP Connection(17)-] 2014-08-23 09:57:10,280 (line

The decommissioned node no longer appears in OpsCenter, and 'nodetool status' shows it gone
from the cluster as well, with the remaining 4 nodes in UN state.

All is good... Then I noticed that the DownEndpointCount (still) shows as 1 - using a JMX
console, and browsing to, FailureDetector, Attributes, DownEndpointCount.
While there, I also noticed that SimpleStates shows the decommissioned node as down, and the
AllEndpointStates shows it as STATUS:LEFT

I tried running a 'nodetool removenode decom-node's-host-id', but it failed with "Host ID
not found", which I expected, given I decommissioned it and it does not show in nodetool status.

nodetool describecluster lists only the expected 4 nodes (does not show the decommissioned

Checking the system.peers table shows that it still lists the decomm-ed node, with a null host_id,
rack, release_version, rpc_address, schema_version, etc.
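
For illustration, the check described above can be sketched in Python. This is a minimal sketch, not part of the original report: it assumes the rows of system.peers have already been fetched (for example with a SELECT of the columns named above plus the `peer` key column) and are available as dictionaries. The function name `find_ghost_peers` and the sample addresses and values are hypothetical.

```python
from typing import Any, Iterable, List, Mapping

# Columns the report describes as null for the decommissioned node.
NULLED_COLUMNS = ("host_id", "rack", "release_version", "rpc_address", "schema_version")


def find_ghost_peers(rows: Iterable[Mapping[str, Any]]) -> List[str]:
    """Return the `peer` address of every row whose metadata columns are all null."""
    ghosts = []
    for row in rows:
        if all(row.get(col) is None for col in NULLED_COLUMNS):
            ghosts.append(row["peer"])
    return ghosts


# Hypothetical sample data: one healthy peer and one row left behind after a decommission.
sample_rows = [
    {"peer": "10.0.0.1", "host_id": "id-1", "rack": "rack1", "release_version": "x.y.z",
     "rpc_address": "10.0.0.1", "schema_version": "schema-1"},
    {"peer": "10.0.0.9", "host_id": None, "rack": None, "release_version": None,
     "rpc_address": None, "schema_version": None},
]

ghost_peers = find_ghost_peers(sample_rows)
print(ghost_peers)
```

A row is flagged only when all of the listed columns are null, so a peer that is merely missing one value is not reported.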

Adding JVM_OPTS="$JVM_OPTS -Dcassandra.load_ring_state=false" to the as suggested

does not help. I have actually tried this before, when I was decommissioning a node on an
older C* version and it worked, but now it does nothing. If I delete the row mentioning the
decommissioned node from the system.peers table it stays out of there until the next dse service

This is causing apps to time out, since they get an invalid node's IP... As a workaround I remove
the entry from the peers table, but it is not permanent...
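
The workaround above could be scripted roughly as follows. This is a hedged sketch, not the exact command used in the report: it only builds the CQL text for deleting one row from system.peers by its `peer` key, and validates the address first so that nothing but an IP literal ends up in the statement. The helper name `build_peer_delete` and the example address are hypothetical, and the statement would still have to be run on each node where the stale row appears.

```python
import ipaddress


def build_peer_delete(peer_ip: str) -> str:
    """Build a CQL DELETE for one system.peers row, keyed by the peer's IP address."""
    # ip_address() raises ValueError for anything that is not a valid IPv4/IPv6 literal.
    addr = ipaddress.ip_address(peer_ip)
    return f"DELETE FROM system.peers WHERE peer = '{addr}';"


# Hypothetical address standing in for the decommissioned node.
statement = build_peer_delete("10.0.0.9")
print(statement)
```

As the report notes, the deletion is not permanent, so this only automates the temporary cleanup.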

This message was sent by Atlassian JIRA
