cassandra-commits mailing list archives

From "Jeremiah Jordan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap
Date Sat, 29 Jul 2017 01:13:01 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105957#comment-16105957 ]

Jeremiah Jordan commented on CASSANDRA-13348:
---------------------------------------------

I have also seen nodes end up with a partial view of cluster state when auto bootstrap
is false. In that case we do not wait as long for things to settle, so in a large cluster,
or one whose datacenters have a lot of latency between them, nodes can come up before they
have a full view of things. So I could see this possibly happening in that case as well.
The fix from CASSANDRA-13700 helps with that, because under those conditions the bug there
caused even more trouble with gossip settling.
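
To make the reasoning above concrete, here is a minimal sketch of how a partial ring
view could lead two nodes to the same token. It is a toy model and not Cassandra's
allocator: the function allocate_token, the 1000-slot ring, and the "midpoint of the
widest gap" rule are all assumptions made for illustration. The only point is that a
deterministic allocator given the same (incomplete) input produces the same output.

```python
def allocate_token(known_tokens, ring_size=1000):
    """Toy allocator (hypothetical): midpoint of the widest gap between known tokens."""
    ring = sorted(known_tokens)
    # Start with the wrap-around gap from the last token back to the first.
    wrap_width = (ring[0] - ring[-1]) % ring_size or ring_size
    best_start, best_width = ring[-1], wrap_width
    for left, right in zip(ring, ring[1:]):
        if right - left > best_width:
            best_start, best_width = left, right - left
    return (best_start + best_width // 2) % ring_size


existing = {100, 400, 900}

# Node 1 joins with a full view of the existing ring.
node1_token = allocate_token(existing)

# Node 2 joins later. With a full view it sees node 1's token and avoids it.
node2_full_view = allocate_token(existing | {node1_token})

# With a partial view (node 1's state has not arrived yet) it repeats node 1's choice.
node2_partial_view = allocate_token(existing)

print(node1_token, node2_full_view, node2_partial_view)
```

In this model the time between the two joins does not matter. What matters is whether
the second node has learned about the first node's tokens before it allocates.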

> Duplicate tokens after bootstrap
> --------------------------------
>
>                 Key: CASSANDRA-13348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13348
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Tom van der Woerdt
>            Assignee: Dikang Gu
>            Priority: Blocker
>             Fix For: 3.0.x
>
>
> This one is a bit scary, and probably results in data loss. After a bootstrap of a few new nodes into an existing cluster, two new nodes have chosen some overlapping tokens.
> In fact, of the 256 tokens chosen, 51 tokens were already in use on the other node.
> Node 1 log:
> {noformat}
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 StorageService.java:1160 - JOINING: waiting for ring information
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 StorageService.java:1160 - JOINING: waiting for schema information to complete
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 StorageService.java:1160 - JOINING: waiting for pending range calculation
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 StorageService.java:1160 - JOINING: getting bootstrap token
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,564 TokenAllocation.java:61 - Selected tokens [............, 2959334889475814712, 3727103702384420083, 7183119311535804926, 6013900799616279548, -1222135324851761575, 1645259890258332163, -1213352346686661387, 7604192574911909354]
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 TokenAllocation.java:65 - Replicated node load in datacentre before allocation max 1.00 min 1.00 stddev 0.0000
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 TokenAllocation.java:66 - Replicated node load in datacentre after allocation max 1.00 min 1.00 stddev 0.0000
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 TokenAllocation.java:70 - Unexpected growth in standard deviation after allocation.
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:44,150 StorageService.java:1160 - JOINING: sleeping 30000 ms for pending range setup
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:43:14,151 StorageService.java:1160 - JOINING: Starting to bootstrap...
> {noformat}
> Node 2 log:
> {noformat}
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:51,937 StorageService.java:971 - Joining ring by operator request
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 StorageService.java:1160 - JOINING: waiting for ring information
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 StorageService.java:1160 - JOINING: waiting for schema information to complete
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 StorageService.java:1160 - JOINING: waiting for pending range calculation
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 StorageService.java:1160 - JOINING: getting bootstrap token
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,630 TokenAllocation.java:61 - Selected tokens [......, 2890709530010722764, -2416006722819773829, -5820248611267569511, -5990139574852472056, 1645259890258332163, 9135021011763659240, -5451286144622276797, 7604192574911909354]
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,794 TokenAllocation.java:65 - Replicated node load in datacentre before allocation max 1.02 min 0.98 stddev 0.0000
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,795 TokenAllocation.java:66 - Replicated node load in datacentre after allocation max 1.00 min 1.00 stddev 0.0000
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:53,149 StorageService.java:1160 - JOINING: sleeping 30000 ms for pending range setup
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:56:23,149 StorageService.java:1160 - JOINING: Starting to bootstrap...
> {noformat}
> E.g. 7604192574911909354 has been chosen by both nodes.
> The joins were eight days apart, so I don't think it's a race :)
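
The overlap described in the report can be checked mechanically from the two
"Selected tokens" log lines. The following is a rough sketch, not part of any
Cassandra tooling: the helper name selected_tokens is made up for illustration, and
the input only contains the tokens visible in the quoted logs (the elided part of
each list stays elided), so it finds the visible overlap and not all 51 reported
duplicates.

```python
import re

TOKEN_RE = re.compile(r"-?\d+")
MARKER = "Selected tokens ["


def selected_tokens(log_line):
    """Return the integer tokens listed in a 'Selected tokens [...]' log line."""
    start = log_line.index(MARKER) + len(MARKER)
    end = log_line.index("]", start)
    # Elided entries such as "......" contain no digits and are skipped.
    return {int(tok) for tok in TOKEN_RE.findall(log_line[start:end])}


node1_line = (
    "WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,564 "
    "TokenAllocation.java:61 - Selected tokens [............, "
    "2959334889475814712, 3727103702384420083, 7183119311535804926, "
    "6013900799616279548, -1222135324851761575, 1645259890258332163, "
    "-1213352346686661387, 7604192574911909354]"
)
node2_line = (
    "WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,630 "
    "TokenAllocation.java:61 - Selected tokens [......, "
    "2890709530010722764, -2416006722819773829, -5820248611267569511, "
    "-5990139574852472056, 1645259890258332163, 9135021011763659240, "
    "-5451286144622276797, 7604192574911909354]"
)

duplicates = selected_tokens(node1_line) & selected_tokens(node2_line)
print(sorted(duplicates))  # [1645259890258332163, 7604192574911909354]
```

Of the tokens visible in the quoted logs, two appear in both lists, including the one
called out in the report.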



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


