Date: Sun, 15 Apr 2018 04:11:00 +0000 (UTC)
From: "Dikang Gu (JIRA)"
To: commits@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

    [ https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16438577#comment-16438577 ]

Dikang Gu commented on CASSANDRA-13348:
---------------------------------------

I'm unable to reproduce this particular issue in our test or production environment. I think we need strongly consistent membership (CASSANDRA-9667) to truly avoid race conditions during token allocation.

> Duplicate tokens after bootstrap
> --------------------------------
>
>                 Key: CASSANDRA-13348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13348
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Tom van der Woerdt
>            Assignee: Dikang Gu
>            Priority: Blocker
>             Fix For: 3.0.x
>
>
> This one is a bit scary, and probably results in data loss. After a bootstrap of a few new nodes into an existing cluster, two new nodes have chosen some overlapping tokens.
> In fact, of the 256 tokens chosen, 51 tokens were already in use on the other node.
> Node 1 log:
> {noformat}
> INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 StorageService.java:1160 - JOINING: waiting for ring information
> INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 StorageService.java:1160 - JOINING: waiting for schema information to complete
> INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
> INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 StorageService.java:1160 - JOINING: waiting for pending range calculation
> INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
> INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 StorageService.java:1160 - JOINING: getting bootstrap token
> WARN [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,564 TokenAllocation.java:61 - Selected tokens [............, 2959334889475814712, 3727103702384420083, 7183119311535804926, 6013900799616279548, -1222135324851761575, 1645259890258332163, -1213352346686661387, 7604192574911909354]
> WARN [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 TokenAllocation.java:65 - Replicated node load in datacentre before allocation max 1.00 min 1.00 stddev 0.0000
> WARN [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 TokenAllocation.java:66 - Replicated node load in datacentre after allocation max 1.00 min 1.00 stddev 0.0000
> WARN [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 TokenAllocation.java:70 - Unexpected growth in standard deviation after allocation.
> INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:44,150 StorageService.java:1160 - JOINING: sleeping 30000 ms for pending range setup
> INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:43:14,151 StorageService.java:1160 - JOINING: Starting to bootstrap...
> {noformat}
> Node 2 log:
> {noformat}
> INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:51,937 StorageService.java:971 - Joining ring by operator request
> INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 StorageService.java:1160 - JOINING: waiting for ring information
> INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 StorageService.java:1160 - JOINING: waiting for schema information to complete
> INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
> INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 StorageService.java:1160 - JOINING: waiting for pending range calculation
> INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
> INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 StorageService.java:1160 - JOINING: getting bootstrap token
> WARN [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,630 TokenAllocation.java:61 - Selected tokens [......, 2890709530010722764, -2416006722819773829, -5820248611267569511, -5990139574852472056, 1645259890258332163, 9135021011763659240, -5451286144622276797, 7604192574911909354]
> WARN [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,794 TokenAllocation.java:65 - Replicated node load in datacentre before allocation max 1.02 min 0.98 stddev 0.0000
> WARN [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,795 TokenAllocation.java:66 - Replicated node load in datacentre after allocation max 1.00 min 1.00 stddev 0.0000
> INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:53,149 StorageService.java:1160 - JOINING: sleeping 30000 ms for pending range setup
> INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:56:23,149 StorageService.java:1160 - JOINING: Starting to bootstrap...
> {noformat}
> E.g. 7604192574911909354 has been chosen by both.
> The joins were eight days apart, so I don't think it's a race :)

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
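
The overlap described in the report can be checked mechanically from the two "Selected tokens" log lines. The following is a minimal sketch in Python, not anything taken from Cassandra itself: the helper names parse_selected_tokens and duplicate_tokens are hypothetical, and the sample strings hold only the tokens visible in the report. The leading entries were elided in the original logs and stay elided here, so the result covers only the visible tail of each list.

```python
import re

# Captures the bracketed list in a "Selected tokens [...]" log line.
# Written against the TokenAllocation.java WARN lines quoted in the report.
SELECTED_TOKENS = re.compile(r"Selected tokens \[(.*?)\]")
# Murmur3 tokens are signed 64-bit integers, so allow a leading minus sign.
TOKEN = re.compile(r"-?\d+")


def parse_selected_tokens(log_line):
    """Return the set of integer tokens found in one 'Selected tokens' line."""
    match = SELECTED_TOKENS.search(log_line)
    if match is None:
        return set()
    # Elided entries such as "......" contain no digits and are skipped.
    return {int(tok) for tok in TOKEN.findall(match.group(1))}


def duplicate_tokens(line_a, line_b):
    """Return the sorted tokens that appear in both log lines."""
    return sorted(parse_selected_tokens(line_a) & parse_selected_tokens(line_b))


# Abbreviated copies of the two WARN lines (prefix shortened, elisions kept).
node1 = (
    "WARN TokenAllocation.java:61 - Selected tokens [............, "
    "2959334889475814712, 3727103702384420083, 7183119311535804926, "
    "6013900799616279548, -1222135324851761575, 1645259890258332163, "
    "-1213352346686661387, 7604192574911909354]"
)
node2 = (
    "WARN TokenAllocation.java:61 - Selected tokens [......, "
    "2890709530010722764, -2416006722819773829, -5820248611267569511, "
    "-5990139574852472056, 1645259890258332163, 9135021011763659240, "
    "-5451286144622276797, 7604192574911909354]"
)

print(duplicate_tokens(node1, node2))
# → [1645259890258332163, 7604192574911909354]
```

Run against the complete, unelided log lines, the same intersection should list every token shared by the two nodes, which the report puts at 51 of 256.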