cassandra-commits mailing list archives

From "Brandon Williams (JIRA)" <>
Subject [jira] Created: (CASSANDRA-2072) Race condition during decommission
Date Fri, 28 Jan 2011 00:07:43 GMT
Race condition during decommission

                 Key: CASSANDRA-2072
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 0.7.0
            Reporter: Brandon Williams
            Priority: Minor

Occasionally, decommissioning a node triggers a race condition in which another node never
removes the decommissioned node's token and instead propagates it again with a state of down.
With CASSANDRA-1900 we can solve this, but it shouldn't occur in the first place.

Given nodes A, B, and C, if you decommission B it will stream to A and C.  When complete,
B will decommission and receive this stacktrace:

ERROR 00:02:40,282 Fatal exception in thread Thread[Thread-5,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
        at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(
        at java.util.concurrent.ThreadPoolExecutor.reject(
        at java.util.concurrent.ThreadPoolExecutor.execute(

At this point A will show it is removing B's token, but C will not and instead its failure
detector will report that B is dead, and nodetool ring on C shows B in a leaving/down state.
 In another gossip round, C will propagate this state back to A.
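
For reference, the exception type in the stacktrace above is what a plain
java.util.concurrent executor raises by default when a task is handed to it after
shutdown().  The following is only a minimal sketch of that general mechanism, assuming
the rejected task arrived after its executor had already been shut down (the trace
message suggests this but does not show it).  The class and method names are
hypothetical and this is not Cassandra code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical illustration only; not taken from the Cassandra source tree.
public class ShutdownRaceSketch {

    /** Returns true if a task handed to an already shut down executor is rejected. */
    static boolean submitAfterShutdownIsRejected() {
        // Stand-in for a stage executor on the decommissioning node.
        ExecutorService stage = Executors.newFixedThreadPool(1);

        // The decommission path shuts the executor down...
        stage.shutdown();

        try {
            // ...while another thread still tries to hand it work.
            stage.execute(() -> { });
            return false;
        } catch (RejectedExecutionException e) {
            // The default rejection policy (AbortPolicy) throws here.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected after shutdown: " + submitAfterShutdownIsRejected());
    }
}
```

If the same ordering happens on B, any work that arrives after the executor is shut
down is dropped rather than run, which would be consistent with only some peers seeing
the final state change.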

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
