cassandra-commits mailing list archives

From "Jaakko Laine (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (CASSANDRA-603) pending range collision between nodes
Date Fri, 11 Dec 2009 11:35:18 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789250#action_12789250
] 

Jaakko Laine edited comment on CASSANDRA-603 at 12/11/09 11:33 AM:
-------------------------------------------------------------------

Patch attached. Modifications:

Keep track of booting and leaving tokens, and recalculate pending ranges every time there is
a status change. This keeps them up to date. To ensure that pending ranges cover a node's
final range, the following reasoning is used in the calculation:

(1) When in doubt, it is better to write too much to a node than too little. That is, if there
are multiple nodes moving, calculate the biggest ranges a node could have. Cleaning up unneeded
data afterwards is better than missing writes during movement.

(2) When a node leaves, ranges for other nodes can only grow (a node might get additional
ranges, but it will not lose any of its current ranges as a result of a leave). Therefore
we first remove _all_ leaving tokens for the sake of the calculation, and then check which
ranges would go where once every leaving node is gone. This way we get the biggest possible ranges
with regard to the current leave operations, covering all subsets of possible final range values.
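A minimal sketch of the leave-side reasoning, assuming a deliberately simplified ring (one integer token per node, replication factor 1, range (predecessor, token] owned by the node at `token`) rather than Cassandra's actual TokenMetadata:

```java
import java.util.*;

// Hypothetical simplified model, not the patch itself: one token per node,
// replication factor 1, range (predecessor, token] belongs to the node at `token`.
public class LeavePendingRanges {
    // Returns, for each remaining token, the range it would own if ALL
    // leaving tokens were removed from the ring at once.
    static Map<Integer, int[]> rangesAfterLeaves(SortedSet<Integer> ring,
                                                 Set<Integer> leaving) {
        SortedSet<Integer> remaining = new TreeSet<>(ring);
        remaining.removeAll(leaving);             // remove all leaving tokens first
        Map<Integer, int[]> ranges = new HashMap<>();
        Integer prev = remaining.last();          // the ring wraps around
        for (Integer t : remaining) {
            ranges.put(t, new int[] { prev, t }); // half-open range (prev, t]
            prev = t;
        }
        return ranges;
    }

    public static void main(String[] args) {
        SortedSet<Integer> ring = new TreeSet<>(Arrays.asList(10, 20, 30, 40));
        Map<Integer, int[]> r = rangesAfterLeaves(ring, Set.of(20));
        // With 20 leaving, the node at 30 would cover (10, 30] instead of (20, 30].
        System.out.println(Arrays.toString(r.get(30)));
    }
}
```

Because every leaving token is removed before ranges are computed, each remaining node sees the largest range it could possibly inherit, whatever subset of the leaves actually completes first.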

(3) When a node bootstraps, ranges of other nodes can only get smaller. Without doing complex
calculations to see if multiple bootstraps overlap, we simply base calculations on the same
token ring used before (reflecting the situation after all leave operations have completed). Bootstrapping
nodes are added to and removed from that metadata one by one, and we check what their ranges
would be. This gives us the biggest possible ranges a node could have. Other bootstraps might
make our actual final ranges smaller, but that does not matter, as we can clean up the data
afterwards.
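The bootstrap-side reasoning can be sketched in the same toy model (hypothetical names, not the patch's actual code): each bootstrapping token is evaluated against the post-leave ring in isolation, so overlapping bootstraps each see their largest possible range.

```java
import java.util.*;

// Hypothetical sketch: each bootstrapping token is added to the ring one at a
// time (never together with other bootstraps), so each candidate sees the
// largest range it could end up owning.
public class BootstrapPendingRanges {
    // Range the bootstrapping token would take over: (predecessor, token].
    static int[] pendingRange(SortedSet<Integer> ringAfterLeaves, int bootToken) {
        SortedSet<Integer> copy = new TreeSet<>(ringAfterLeaves);
        copy.add(bootToken);                      // add this one token only
        SortedSet<Integer> lower = copy.headSet(bootToken);
        int prev = lower.isEmpty() ? copy.last() : lower.last(); // wrap around
        return new int[] { prev, bootToken };
    }

    public static void main(String[] args) {
        SortedSet<Integer> ring = new TreeSet<>(Arrays.asList(10, 30, 50));
        // Two overlapping bootstraps, 20 and 25, are evaluated independently,
        // so both get the full (10, x] range even though they overlap:
        System.out.println(Arrays.toString(pendingRange(ring, 20)));
        System.out.println(Arrays.toString(pendingRange(ring, 25)));
    }
}
```

The overlap between the two pending ranges is harmless under rule (1): the extra writes land on both candidates and the unneeded data is cleaned up afterwards.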

A Bootstrap Token collision (formerly a pending range collision) is now thrown only if the
bootstrap tokens are identical.
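Under this scheme the collision check reduces to an identical-token check; a hypothetical sketch (invented names, not the actual StorageService/TokenMetadata code):

```java
import java.util.*;

// Hypothetical sketch: a collision is reported only when two different
// bootstrapping endpoints announce the *same* token, not when their
// pending ranges merely overlap.
public class BootstrapTokenCheck {
    private final Map<Integer, String> bootstrapTokens = new HashMap<>();

    void addBootstrapToken(int token, String endpoint) {
        String other = bootstrapTokens.get(token);
        if (other != null && !other.equals(endpoint))
            throw new RuntimeException(
                "Bootstrap Token collision between " + other + " and " + endpoint);
        bootstrapTokens.put(token, endpoint);
    }
}
```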

Calculating pending ranges is a rather heavy operation, but since it is done only once per
node state change in the cluster, it should be manageable.

This patch would also make #572 cleaner to do, since we now know which way a node is going
and can update pending ranges accordingly.

Edit: this also removes nodeprobe cancelpendingranges, which would now be pointless. If there
is a node/token that has not finished its move operation, nodeprobe removetoken will do the trick.

> pending range collision between nodes
> -------------------------------------
>
>                 Key: CASSANDRA-603
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-603
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.5
>            Reporter: Chris Goffinet
>             Fix For: 0.5
>
>         Attachments: 603.patch
>
>
> We bootstrapped 5 nodes on the east coast from an existing cluster (5) on west. We waited
at least 60 seconds before starting up each node so it would start bootstrapping. We started
seeing these types of errors:
>  INFO [GMFD:1] 2009-12-04 01:45:42,065 Gossiper.java (line 568) Node /X.X.X.140 has now
joined.
> ERROR [GMFD:1] 2009-12-04 01:46:14,371 DebuggableThreadPoolExecutor.java (line 127) Error
in ThreadPoolExecutor
> java.lang.RuntimeException: pending range collision between /X.X.X.139 and /X.X.X.140
>         at org.apache.cassandra.locator.TokenMetadata.addPendingRange(TokenMetadata.java:242)
>         at org.apache.cassandra.service.StorageService.updateBootstrapRanges(StorageService.java:481)
>         at org.apache.cassandra.service.StorageService.onChange(StorageService.java:402)
>         at org.apache.cassandra.gms.Gossiper.doNotifications(Gossiper.java:692)
>         at org.apache.cassandra.gms.Gossiper.applyApplicationStateLocally(Gossiper.java:657)
>         at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:610)
>         at org.apache.cassandra.gms.GossipDigestAckVerbHandler.doVerb(Gossiper.java:978)
>         at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:38)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> ERROR [GMFD:1] 2009-12-04 01:46:14,378 CassandraDaemon.java (line 71) Fatal exception
in thread Thread[GMFD:1,5,main]   
> java.lang.RuntimeException: pending range collision between /X.X.X.139 and /X.X.X.140
> java.lang.RuntimeException: pending range collision between /X.X.X.139 and /X.X.X.140
>         at org.apache.cassandra.locator.TokenMetadata.addPendingRange(TokenMetadata.java:242)
>         at org.apache.cassandra.service.StorageService.updateBootstrapRanges(StorageService.java:481)
>         at org.apache.cassandra.service.StorageService.onChange(StorageService.java:402)
>         at org.apache.cassandra.gms.Gossiper.doNotifications(Gossiper.java:692)
>         at org.apache.cassandra.gms.Gossiper.applyApplicationStateLocally(Gossiper.java:657)
>         at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:610)
>         at org.apache.cassandra.gms.GossipDigestAckVerbHandler.doVerb(Gossiper.java:978)
>         at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:38)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619) 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

