Date: Fri, 7 Oct 2016 18:56:22 +0000 (UTC)
From: "Yabin Meng (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-10371) Decommissioned nodes can remain in gossip

    [ https://issues.apache.org/jira/browse/CASSANDRA-10371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15555952#comment-15555952 ]

Yabin Meng commented on CASSANDRA-10371:
----------------------------------------

Hi, I assume 2.2.8 should have this issue fixed.
But in my CCM-based 3-node cluster test, I still see the decommissioned node showing up in gossip. Below is what I did. Is there anything that I am missing here?

1) Bring up a CCM-based 3-node cluster (version 2.2.8).
2) Decommission node3 (ccm node3 nodetool decommission).
3) On node1, run "nodetool describecluster", which reported the schema disagreement shown below. I double-checked the gossip info (ccm node1 nodetool gossipinfo) and still see node3's info.

Cluster Information:
    Name: c2.2.8
    Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        19d024c9-9762-35a0-931c-515c9d9d08a6: [127.0.0.1, 127.0.0.2]

        UNREACHABLE: [127.0.0.3]

> Decommissioned nodes can remain in gossip
> -----------------------------------------
>
>                 Key: CASSANDRA-10371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Distributed Metadata
>            Reporter: Brandon Williams
>            Assignee: Joel Knighton
>            Priority: Minor
>             Fix For: 2.1.14, 2.2.6, 3.0.4, 3.4
>
>
> This may apply to other dead states as well. Dead states should be expired after 3 days. In the case of decom we attach a timestamp to let the other nodes know when it should be expired. It has been observed that sometimes a subset of nodes in the cluster never expire the state, and through heap analysis of these nodes it is revealed that the epstate.isAlive check returns true when it should return false, which would allow the state to be evicted. This may have been affected by CASSANDRA-8336.
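
The check in step 3 could be scripted when repeating the test. Below is a minimal sketch in Python, assuming the "Schema versions" part of the "nodetool describecluster" output always has the shape quoted in the comment (a schema UUID or the word UNREACHABLE, followed by a bracketed list of addresses). The function names and the SAMPLE constant are hypothetical and are not part of Cassandra or CCM.

```python
import re

# Matches entries such as "<schema-uuid>: [ip, ip]" or "UNREACHABLE: [ip]".
# Assumption: the output keeps the shape quoted in the comment.
_VERSION_LINE = re.compile(r"([0-9a-fA-F-]{36}|UNREACHABLE):\s*\[([^\]]*)\]")


def parse_schema_versions(text):
    """Map each schema version (or 'UNREACHABLE') to its list of endpoints."""
    versions = {}
    for key, members in _VERSION_LINE.findall(text):
        endpoints = [m.strip() for m in members.split(",") if m.strip()]
        versions.setdefault(key, []).extend(endpoints)
    return versions


def unreachable_endpoints(text):
    """Return the endpoints listed under UNREACHABLE, or an empty list."""
    return parse_schema_versions(text).get("UNREACHABLE", [])


# Output copied from the comment above, used here as sample input.
SAMPLE = """Cluster Information:
    Name: c2.2.8
    Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        19d024c9-9762-35a0-931c-515c9d9d08a6: [127.0.0.1, 127.0.0.2]

        UNREACHABLE: [127.0.0.3]
"""

if __name__ == "__main__":
    print(unreachable_endpoints(SAMPLE))
```

In a real run, the text would come from capturing the output of "ccm node1 nodetool describecluster". A non-empty result that still lists the decommissioned node's address would reproduce the observation reported above.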