cassandra-commits mailing list archives

From "Jaakko Laine (JIRA)" <j...@apache.org>
Subject [jira] Updated: (CASSANDRA-572) handle old gossip properly
Date Thu, 03 Dec 2009 08:42:20 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jaakko Laine updated CASSANDRA-572:
-----------------------------------

    Attachment: use-same-APstate-for-all-node-state-gossip.patch

OK, here's a patch that uses the same state name (NODE_STATE) to gossip all movement information.
The format is (BOOTSTRAPPING|NORMAL|LEAVING|LEFT)|token.
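
To make the format concrete, here is a minimal Python sketch of parsing and producing such a value. It is not taken from the patch: the names Movement, NodeState, parse_node_state and format_node_state are hypothetical, and it assumes the "|" between the state name and the token is a literal delimiter and that the token can be carried as an opaque string.

```python
from enum import Enum
from typing import NamedTuple


class Movement(Enum):
    """The four movement states named in the NODE_STATE format."""
    BOOTSTRAPPING = "BOOTSTRAPPING"
    NORMAL = "NORMAL"
    LEAVING = "LEAVING"
    LEFT = "LEFT"


class NodeState(NamedTuple):
    movement: Movement
    token: str  # kept as an opaque string in this sketch


DELIMITER = "|"  # assumption: literal separator between state name and token


def parse_node_state(value: str) -> NodeState:
    """Split '<STATE>|<token>' into its two parts, rejecting malformed input."""
    state_name, sep, token = value.partition(DELIMITER)
    if not sep or not token:
        raise ValueError(f"malformed NODE_STATE value: {value!r}")
    # Movement(...) raises ValueError for an unknown state name.
    return NodeState(Movement(state_name), token)


def format_node_state(state: NodeState) -> str:
    """Inverse of parse_node_state."""
    return f"{state.movement.value}{DELIMITER}{state.token}"
```

With a single state name, a receiver only ever has to look at one value per endpoint to learn both where the node is in its movement cycle and which token it refers to.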

The main changes to the state machine resulting from this modification are:
(1) When a node is bootstrapping, we should clear pending ranges for this endpoint, as well
as remove it from token metadata. These steps are not strictly necessary (I think), but they
help with the LEAVING -> BOOTSTRAPPING transition in case we missed LEFT due to a network
partition.
(2) handleStateLeaving and handleStateLeft remove pending ranges for this endpoint before
doing anything else. If we missed NORMAL, there might be obsolete pending ranges from BOOTSTRAPPING.
A distant possibility, but a possibility nonetheless.
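
The two points above can be sketched roughly as follows. This is an illustration in Python, not the patch itself: RingView and its methods are hypothetical stand-ins for token metadata and the state handlers, pending ranges are reduced to a placeholder list, and the behaviour of the NORMAL handler is an assumption.

```python
class RingView:
    """Hypothetical stand-in for token metadata plus pending ranges."""

    def __init__(self):
        self.tokens = {}          # endpoint -> token of nodes in the ring
        self.pending_ranges = {}  # endpoint -> placeholder list of ranges
        self.leaving = set()      # endpoints known to be leaving

    def handle_bootstrapping(self, endpoint, token):
        # Point (1): drop any stale state first, in case LEFT was missed.
        self.pending_ranges.pop(endpoint, None)
        self.tokens.pop(endpoint, None)
        self.leaving.discard(endpoint)
        self.pending_ranges[endpoint] = [("pending-for", token)]

    def handle_normal(self, endpoint, token):
        # Assumption: NORMAL is what usually clears bootstrap pending ranges.
        self.pending_ranges.pop(endpoint, None)
        self.tokens[endpoint] = token

    def handle_leaving(self, endpoint, token):
        # Point (2): clear pending ranges before anything else.
        self.pending_ranges.pop(endpoint, None)
        self.leaving.add(endpoint)

    def handle_left(self, endpoint, token):
        # Point (2): same cleanup, then forget the endpoint entirely.
        self.pending_ranges.pop(endpoint, None)
        self.leaving.discard(endpoint)
        self.tokens.pop(endpoint, None)
```

For example, if a receiver sees BOOTSTRAPPING and then LEAVING for the same endpoint without NORMAL in between, the leftover bootstrap pending ranges are gone after handle_leaving runs.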

The following additional check is not directly related to the gossip format change; the problem
could occur even with the current model. It is unlikely in general, but in a large (say, 200+ nodes)
multi-DC cluster with lots of node movement it could well happen even with a relatively
short DC-to-DC network outage:
(1) Added a check to handleStateLeaving and handleStateLeft for the case where a node has gone through a
NORMAL -> LEAVING -> LEFT -> BOOTSTRAP -> NORMAL -> LEAVING [-> LEFT] movement
cycle without us seeing the intermediate stages. In this case we have information for the
old token, and the node is now leaving its _new_ token. We cannot simply assert that the tokens
match, because this situation can genuinely occur.
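
A rough Python sketch of such a check is below. The message does not say what the patch does once it detects the mismatch, so the recovery step here (log it and adopt the gossiped token) is purely an assumption, and reconcile_leaving_token is a hypothetical name.

```python
import logging

log = logging.getLogger("gossip-sketch")


def reconcile_leaving_token(tokens, endpoint, gossiped_token):
    """Compare our recorded token with the one in a LEAVING/LEFT message.

    `tokens` maps endpoint -> token. Returns True if our view was stale
    (or missing) and had to be updated, False if it already matched.
    """
    known = tokens.get(endpoint)
    if known == gossiped_token:
        return False
    # Not asserted: the node may have completed a whole movement cycle
    # while we were partitioned away, so a mismatch is a legitimate case.
    log.warning("endpoint %s is leaving token %s, but we have %s on record",
                endpoint, gossiped_token, known)
    tokens[endpoint] = gossiped_token  # assumed recovery: trust the newer gossip
    return True
```

A handler would call this before computing anything from the token, so that later steps work from the token the node is actually leaving.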

Of course, this already touches on the question of which conditions we must handle ourselves and which
should be left to operators. Some of these checks (like removing all references to the endpoint
before continuing to handle bootstrapping) are questionable and might relax safety precautions,
but if we do not do them, a modest 30s network outage might cause us to miss STATE_LEFT,
and we would end up with strange pending ranges.

I don't expect this patch to be included as it is, but let's see what people think of this
gossip change and then discuss what checks should be made :)


> handle old gossip properly
> --------------------------
>
>                 Key: CASSANDRA-572
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-572
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.5
>            Reporter: Jaakko Laine
>             Fix For: 0.5
>
>         Attachments: 572-handle-old-gossip.patch, use-same-APstate-for-all-node-state-gossip.patch
>
>
> (1) If a node has been moving in the ring, further bootstraps by other nodes will cause
> errors, as they handle STATE_LEAVING gossip without having such a member in token metadata.
> (2) When a node bootstraps, it handles all ep states in the order they happen to arrive.
> If the first one to arrive has moved in the past (that is, it has STATE_LEAVING in its ep
> state), getNaturalEndpoint will throw an ArrayIndexOutOfBounds exception because
> sortedTokens.size() == 0.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

