From: "Vincent Rischmann"
To: user@cassandra.apache.org
Subject: gossipinfo contains two nodes dead for more than two years
Date: Wed, 28 Aug 2019 16:33:24 +0200
Message-Id: <07a5ebd1-0330-4072-820c-ff6cc0b14a53@www.fastmail.com>

Hi,

while replacing a node in a cluster I saw this log:

    2019-08-27 16:35:31,439 Gossiper.java:995 - InetAddress /10.15.53.27 is now DOWN

It caught my attention because that IP address doesn't exist anymore in the cluster, and it hasn't for a long time.

After some reading I ran `nodetool gossipinfo` and saw these entries, which are nodes that don't exist anymore:

    /10.15.53.27
      generation:1503480618
      heartbeat:26970
      STATUS:2:hibernate,true
      LOAD:26810:6.17363354147E11
      SCHEMA:101:d21b1e47-f226-3417-8de7-5802518ae824
      DC:10:DC1
      RACK:12:RAC1
      RELEASE_VERSION:6:2.1.18
      INTERNAL_IP:8:10.15.53.27
      RPC_ADDRESS:5:10.15.53.27
      SEVERITY:26972:0.0
      NET_VERSION:3:8
      HOST_ID:4:2488fccc-108a-4a9d-ad43-5e8b8b6ee17b
      TOKENS:1:<hidden>
    /10.5.1.16
      generation:1503636779
      heartbeat:324
      STATUS:2:hibernate,true
      LOAD:204:2.601990697532E12
      SCHEMA:14:d21b1e47-f226-3417-8de7-5802518ae824
      DC:10:DC1
      RACK:12:RAC1
      RELEASE_VERSION:6:2.1.18
      INTERNAL_IP:8:10.5.1.16
      RPC_ADDRESS:5:10.5.1.16
      SEVERITY:326:0.0
      NET_VERSION:3:8
      HOST_ID:4:2488fccc-108a-4a9d-ad43-5e8b8b6ee17b
      TOKENS:1:<hidden>

The generations are:

- Wed, 23 Aug 2017 09:30:18 GMT
- Fri, 25 Aug 2017 04:52:59 GMT

I don't remember what we did at that time, but it looks like we botched something while joining a node.

After reading https://thelastpickle.com/blog/2018/09/18/assassinate.html I'm thinking of doing the following:

* nodetool removenode 10.15.53.27
* if it doesn't work for some reason: nodetool assassinate 10.15.53.27

Since those nodes have been long dead and don't appear in system.peers, I don't anticipate any problems, but I'd like some confirmation that this can't break my cluster.

Thanks!
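For anyone wanting to check the generation dates quoted above, here is a minimal sketch. It assumes, as the message does, that a gossip generation is a Unix timestamp in seconds. The helper names (`parse_gossipinfo`, `generation_to_utc`, `describe`) are hypothetical and are not part of Cassandra or nodetool. The sample is trimmed from the quoted `nodetool gossipinfo` output, and the parser only handles that simple layout (an endpoint line starting with `/`, followed by `key:value` lines).

```python
from datetime import datetime, timezone

# Sample trimmed from the `nodetool gossipinfo` output quoted in the message.
SAMPLE = """\
/10.15.53.27
  generation:1503480618
  heartbeat:26970
  STATUS:2:hibernate,true
/10.5.1.16
  generation:1503636779
  heartbeat:324
  STATUS:2:hibernate,true
"""


def parse_gossipinfo(text):
    """Group lines by endpoint; each field is keyed by the text before its first ':'."""
    nodes = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("/"):
            # A new endpoint block starts, e.g. "/10.15.53.27".
            current = nodes.setdefault(line.lstrip("/"), {})
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            current[key] = value
    return nodes


def generation_to_utc(generation):
    """Assumption: the generation is seconds since the Unix epoch."""
    return datetime.fromtimestamp(int(generation), tz=timezone.utc)


def describe(nodes):
    """Return one summary line per endpoint with its generation as a UTC date."""
    lines = []
    for address, fields in nodes.items():
        started = generation_to_utc(fields["generation"])
        # STATUS keeps its raw "version:value" form, e.g. "2:hibernate,true".
        status = fields.get("STATUS", "?")
        lines.append(f"{address} generation={started.isoformat()} STATUS={status}")
    return lines


if __name__ == "__main__":
    for summary in describe(parse_gossipinfo(SAMPLE)):
        print(summary)
```

The two generations convert to 23 Aug 2017 09:30:18 and 25 Aug 2017 04:52:59 UTC, matching the dates given in the message. Real `gossipinfo` output may differ between Cassandra versions, so treat the parser as illustrative only.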