Date: Fri, 20 May 2016 17:09:13 +0000 (UTC)
From: "Sam Tunnicliffe (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Comment Edited] (CASSANDRA-11038) Is node being restarted treated as node joining?

    [ https://issues.apache.org/jira/browse/CASSANDRA-11038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293680#comment-15293680 ]

Sam Tunnicliffe edited comment on CASSANDRA-11038 at 5/20/16 5:08 PM:
----------------------------------------------------------------------

Pushed branches with fixes for 2.2/3.0/3.7/trunk. The fix merges forward cleanly, except for conflicts where I've cleaned up imports.

Basically, these preserve the existing behaviour of delivering both {{NEW_NODE}} and {{UP}} events when a node first joins the cluster, and of delaying both until after the node becomes available for clients. The erroneous {{NEW_NODE}} when a known node is restarted has been removed.

The tracking of pushed notifications in {{EventNotifier}} is still necessary at the moment (because [reasons|https://issues.apache.org/jira/browse/CASSANDRA-7816?focusedCommentId=14346387&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14346387]), but it will go away with CASSANDRA-9156. See CASSANDRA-11731 for some related discussion.
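
For illustration only, the intended notification behaviour can be sketched roughly as below. This is not the code on the branches: the class {{RestartAwareNotifier}}, its methods and the use of plain strings for endpoints are all hypothetical, and it only models the rule described above (NEW_NODE plus UP on a first join, both held back until the node is ready for clients, and only UP when an already-known node restarts).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; this is NOT Cassandra's EventNotifier or the actual patch.
class RestartAwareNotifier {
    enum Event { NEW_NODE, UP }

    // Endpoints for which NEW_NODE has already been pushed to clients.
    private final Set<String> knownEndpoints = new HashSet<>();
    // Endpoints seen joining for the first time but not yet ready for clients.
    private final Set<String> pendingNewNode = new HashSet<>();
    // Stand-in for pushing events to connected clients.
    private final List<String> pushed = new ArrayList<>();

    // Called when gossip reports a join; this fires for first joins and restarts alike.
    public void onJoin(String endpoint) {
        if (!knownEndpoints.contains(endpoint))
            pendingNewNode.add(endpoint);
        // Nothing is pushed here: events are delayed until the node can serve clients.
    }

    // Called once the endpoint is available for client requests.
    public void onClientReady(String endpoint) {
        if (pendingNewNode.remove(endpoint)) {
            knownEndpoints.add(endpoint);
            push(Event.NEW_NODE, endpoint); // only on the first join
        }
        push(Event.UP, endpoint);           // on first join and on every restart
    }

    private void push(Event event, String endpoint) {
        pushed.add(event + " " + endpoint);
    }

    public List<String> pushedEvents() {
        return pushed;
    }

    public static void main(String[] args) {
        RestartAwareNotifier notifier = new RestartAwareNotifier();
        notifier.onJoin("10.0.0.1");        // first join
        notifier.onClientReady("10.0.0.1");
        notifier.onJoin("10.0.0.1");        // restart of a known node
        notifier.onClientReady("10.0.0.1");
        System.out.println(notifier.pushedEvents());
    }
}
```

In this sketch a restart of a known node produces a single UP and no second NEW_NODE, which is the case the original report was concerned with.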

dtest branch [here|https://github.com/beobal/cassandra-dtest/tree/11038]

||branch||testall||dtest||
|[11038-2.2|https://github.com/beobal/cassandra/tree/11038-2.2]|[testall|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11038-2.2-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11038-2.2-dtest]|
|[11038-3.0|https://github.com/beobal/cassandra/tree/11038-3.0]|[testall|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11038-3.0-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11038-3.0-dtest]|
|[11038-3.7|https://github.com/beobal/cassandra/tree/11038-3.7]|[testall|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11038-3.7-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11038-3.7-dtest]|
|[11038-trunk|https://github.com/beobal/cassandra/tree/11038-trunk]|[testall|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11038-trunk-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11038-trunk-dtest]|

(So far I've only kicked off CI for the 2.2 branch, in case there's some problem I didn't run into locally; I'll kick off the other jobs when that finishes.)

edit: pushed an additional commit to the 2.2 branch, as I forgot to switch to Java 7 during dev and accidentally included an 8ism.


> Is node being restarted treated as node joining?
> ------------------------------------------------
>
>                 Key: CASSANDRA-11038
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11038
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Distributed Metadata
>            Reporter: cheng ren
>            Assignee: Sam Tunnicliffe
>            Priority: Minor
>             Fix For: 2.2.x, 3.0.x, 3.x
>
>
> Hi,
> What we found recently is that every time we restart a node, all other nodes in the cluster treat the restarted node as a new node joining and issue a node joining notification to clients. We have traced the code path that is hit when a peer node detects a restarted node:
> src/java/org/apache/cassandra/gms/Gossiper.java
> {code}
>     private void handleMajorStateChange(InetAddress ep, EndpointState epState)
>     {
>         if (!isDeadState(epState))
>         {
>             if (endpointStateMap.get(ep) != null)
>                 logger.info("Node {} has restarted, now UP", ep);
>             else
>                 logger.info("Node {} is now part of the cluster", ep);
>         }
>         if (logger.isTraceEnabled())
>             logger.trace("Adding endpoint state for " + ep);
>         endpointStateMap.put(ep, epState);
>         // the node restarted: it is up to the subscriber to take whatever action is necessary
>         for (IEndpointStateChangeSubscriber subscriber : subscribers)
>             subscriber.onRestart(ep, epState);
>         if (!isDeadState(epState))
>             markAlive(ep, epState);
>         else
>         {
>             logger.debug("Not marking " + ep + " alive due to dead state");
>             markDead(ep, epState);
>         }
>         for (IEndpointStateChangeSubscriber subscriber : subscribers)
>             subscriber.onJoin(ep, epState);
>     }
> {code}
> subscriber.onJoin(ep, epState) ends up calling onJoinCluster in Server.java:
> src/java/org/apache/cassandra/transport/Server.java
> {code}
>     public void onJoinCluster(InetAddress endpoint)
>     {
>         server.connectionTracker.send(Event.TopologyChange.newNode(getRpcAddress(endpoint), server.socket.getPort()));
>     }
> {code}
> We have a full trace of the code path, but have skipped some intermediate function calls here for brevity.
> Upon receiving the node joining notification, clients go and scan the system peer table to fetch the latest topology information. Since we have tens of thousands of client connections, scans from all of them put an enormous load on our cluster.
> Although newer versions of the driver skip fetching the peer table if the new node already exists in local metadata, we are still curious why a node being restarted is handled as a node joining on the server side. Did we hit a bug, or is this the intended behaviour? Our old Java driver version is 1.0.4 and our Cassandra version is 2.0.12.
> Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)