Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cassandra.apache.org
Date: Thu, 9 Mar 2017 05:11:37 +0000 (UTC)
From: "Arijit (JIRA)" <jira@apache.org>
To: commits@cassandra.apache.org
Message-ID: <JIRA.13049224.1488967745000.19065.1489036297971@Atlassian.JIRA>
In-Reply-To: <JIRA.13049224.1488967745000@Atlassian.JIRA>
References: <JIRA.13049224.1488967745000@Atlassian.JIRA> <JIRA.13049224.1488967745607@jira-lw-us.apache.org>
Subject: [jira] [Commented] (CASSANDRA-13308) Hint files not being deleted
 on nodetool decommission
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Thu, 09 Mar 2017 05:11:42 -0000


    [ https://issues.apache.org/jira/browse/CASSANDRA-13308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15902506#comment-15902506 ] 

Arijit commented on CASSANDRA-13308:
------------------------------------

The stack and "logs" were for a non-leaving node. The "logs_decommissioned_node" file was for the leaving node. If you look at the timestamps, you will see that at 06:04:33 the leaving node says DECOMMISSIONED, but the "logs" file shows hinted handoff occurring at 07:01:43. The host id in the hints file corresponds to that of the leaving node.

And you are correct! The cluster had a history of stopping Cassandra on nodes for a while before starting and running "nodetool decommission" on them. I believe this was done a few times before, and it caused the same condition described above at least twice. The nodes might have been down for several hours before the decommission.

> Hint files not being deleted on nodetool decommission
> -----------------------------------------------------
>
>                 Key: CASSANDRA-13308
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13308
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Streaming and Messaging
>         Environment: Using Cassandra version 3.0.9
>            Reporter: Arijit
>         Attachments: 28207.stack, logs, logs_decommissioned_node
>
>
> How to reproduce the issue I'm seeing:
> Shut down Cassandra on one node of the cluster and wait until we accumulate a ton of hints. Start Cassandra on the node and immediately run "nodetool decommission" on it.
> The node streams its replicas and marks itself as DECOMMISSIONED, but other nodes do not seem to see this message. "nodetool status" shows the decommissioned node in state "UL" on all other nodes (it is also present in system.peers), and Cassandra logs show that gossip tasks on nodes are not proceeding (number of pending tasks keeps increasing). Jstack suggests that a gossip task is blocked on hints dispatch (I can provide traces if this is not obvious). Because the cluster is large and there are a lot of hints, this is taking a while. 
> On inspecting "/var/lib/cassandra/hints" on the nodes, I see a bunch of hint files for the decommissioned node. Documentation seems to suggest that these hints should be deleted during "nodetool decommission", but it does not seem to be the case here. This is the bug being reported.
> To recover from this scenario, if I manually delete hint files on the nodes, the hints dispatcher threads throw a bunch of exceptions and the decommissioned node is now in state "DL" (perhaps it missed some gossip messages?). The node is still in my "system.peers" table.
> Restarting Cassandra on all nodes after this step does not fix the issue (the node remains in the peers table). In fact, after this point the decommissioned node is in state "DN".
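
For anyone checking whether a node still holds hints for a decommissioned peer, the inspection of the hints directory described above could be sketched roughly as follows. This is only an illustration, not part of Cassandra: the helper names (hints_by_host, scan_hints_dir) are made up for this example, and it assumes hint files are named <host id>-<timestamp>-<version>.hints, which should be verified against the version in use. It only reads the directory listing and deletes nothing.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed file-name layout: <host-id UUID>-<timestamp>-<version>.hints
HINT_FILE = re.compile(
    r"^([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})-(\d+)-(\d+)\.hints$"
)


def hints_by_host(names):
    """Group hint file names by the host id embedded in each name."""
    grouped = defaultdict(list)
    for name in names:
        match = HINT_FILE.match(name)
        if match:
            # Normalise case so the same host id always maps to one key.
            grouped[match.group(1).lower()].append(name)
    return dict(grouped)


def scan_hints_dir(path="/var/lib/cassandra/hints"):
    """List a hints directory and group its .hints files by host id."""
    directory = Path(path)
    if not directory.is_dir():
        return {}
    return hints_by_host(p.name for p in directory.iterdir())


if __name__ == "__main__":
    for host_id, files in sorted(scan_hints_dir().items()):
        print(f"{host_id}: {len(files)} hint file(s)")
```

Comparing the printed host ids against the host id of the decommissioned node (for example, the one still listed in "system.peers") would show which nodes are still holding hints for it.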


--
This message was sent by Atlassian JIRA
(v6.3.15#6346)