cassandra-commits mailing list archives

From "Jaakko Laine (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-572) handle old gossip properly
Date Thu, 03 Dec 2009 02:47:27 GMT


Jaakko Laine commented on CASSANDRA-572:

Funny thing, I was just thinking about the same thing during breakfast. Have to eat more often.

The problem with this is that handling state changes will become somewhat more complex, as
we must be prepared to handle transitions between any two states in any order. The current gossip
model leaves a trace of what the node has done, and even in the face of network partitions
we can "play back" the transitions when they eventually arrive. That is, if a node moves,
we will still see LEAVING, LEFT, BOOTSTRAPPING and NORMAL and construct token metadata according
to that. If we only have one value to represent a node's current state, we might go from, say,
NORMAL to NORMAL, or even LEFT to LEAVING, without seeing any of the intermediate steps. Of
course this can be done, but it needs extra care. I don't know how much, though. It might very well
be that in the end this would be better than the current way.
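
To make the "play back" idea concrete, here is a rough sketch of replaying a move one transition at a time. The class Ring, its fields, and the endpoint and token values are all made up for illustration; this is not the actual token metadata code, only the shape of the idea.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ReplaySketch {
    enum State { BOOTSTRAPPING, NORMAL, LEAVING, LEFT }

    // Hypothetical stand-in for token metadata; not Cassandra's real class.
    static final class Ring {
        final Map<String, Long> normalTokens = new HashMap<>();
        final Map<String, Long> bootstrapTokens = new HashMap<>();
        final Set<String> leaving = new HashSet<>();

        // Apply one gossiped transition; each step only touches its own bookkeeping.
        void apply(String endpoint, State state, long token) {
            switch (state) {
                case BOOTSTRAPPING:
                    bootstrapTokens.put(endpoint, token);
                    break;
                case NORMAL:
                    bootstrapTokens.remove(endpoint);
                    leaving.remove(endpoint);
                    normalTokens.put(endpoint, token);
                    break;
                case LEAVING:
                    leaving.add(endpoint);
                    break;
                case LEFT:
                    leaving.remove(endpoint);
                    normalTokens.remove(endpoint);
                    break;
            }
        }
    }

    public static void main(String[] args) {
        Ring ring = new Ring();
        ring.apply("10.0.0.1", State.NORMAL, 100L);

        // A move from token 100 to token 500, replayed in the order it happened.
        ring.apply("10.0.0.1", State.LEAVING, 100L);
        ring.apply("10.0.0.1", State.LEFT, 100L);
        ring.apply("10.0.0.1", State.BOOTSTRAPPING, 500L);
        ring.apply("10.0.0.1", State.NORMAL, 500L);

        System.out.println(ring.normalTokens);
    }
}
```

With a single current-state value, the receiver would have to get from any state to any other in one step instead of relying on each intermediate transition having been applied.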

But even this would not remove the need to handle old application state correctly. If a node
enters the ring when another node is just LEAVING or LEFT, that state will be the first one
to be seen, and it must be ignored since there is nothing that can be done if NORMAL has not
been seen. I think the real cause is there in any case, so we can't avoid fixing the symptoms
that arrive with it.
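
Roughly, ignoring old state could look like the sketch below: LEAVING or LEFT is dropped when the endpoint has never been seen as NORMAL. The class, the handle method and its boolean return value are assumptions made up for this example, not what the attached patch does.

```java
import java.util.HashSet;
import java.util.Set;

public class StaleStateSketch {
    enum State { BOOTSTRAPPING, NORMAL, LEAVING, LEFT }

    // Endpoints that have been seen in NORMAL state (hypothetical bookkeeping).
    private final Set<String> members = new HashSet<>();

    /** Returns true if the state was applied, false if it was ignored as stale. */
    boolean handle(String endpoint, State state) {
        switch (state) {
            case NORMAL:
                members.add(endpoint);
                return true;
            case LEAVING:
                // Nothing to mark as leaving if we never knew this member.
                return members.contains(endpoint);
            case LEFT:
                // Set.remove returns false when the endpoint was not a member.
                return members.remove(endpoint);
            default:
                // BOOTSTRAPPING needs no prior membership in this sketch.
                return true;
        }
    }

    public static void main(String[] args) {
        StaleStateSketch handler = new StaleStateSketch();
        // A newly joined node sees an old LEAVING first: ignored.
        System.out.println(handler.handle("10.0.0.2", State.LEAVING));
        // After NORMAL has been seen, LEAVING is meaningful.
        handler.handle("10.0.0.2", State.NORMAL);
        System.out.println(handler.handle("10.0.0.2", State.LEAVING));
    }
}
```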

I'll try this out now that I'm working on the gossiping part anyway so we'll have some more
insight on what it would look like.

> handle old gossip properly
> --------------------------
>                 Key: CASSANDRA-572
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.5
>            Reporter: Jaakko Laine
>             Fix For: 0.5
>         Attachments: 572-handle-old-gossip.patch
> (1) If a node has been moving in the ring, further bootstraps by other nodes will cause
> errors, as they are handling STATE_LEAVING gossip without having such a member in token metadata.
> (2) When a node bootstraps, it handles all ep states in the order they happen to arrive.
> If the first one to arrive has moved in the past (that is, it has STATE_LEAVING in its ep
> state), getNaturalEndpoint will throw an ArrayIndexOutOfBounds exception as sortedTokens.size()
> == 0.
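
As an illustration of case (2), the sketch below shows a ring lookup that checks for an empty token list before indexing into it. The method primaryIndex and its -1 return value are hypothetical; this is not the real getNaturalEndpoint, only a guess at the kind of guard that avoids the out-of-bounds access when sortedTokens.size() == 0.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class EmptyRingSketch {
    /**
     * Index of the first token at or after keyToken, wrapping around the ring.
     * Returns -1 when the ring has no tokens yet (an assumption of this sketch).
     */
    static int primaryIndex(List<Long> sortedTokens, long keyToken) {
        if (sortedTokens.isEmpty()) {
            return -1; // no token metadata yet, nothing to look up
        }
        int i = Collections.binarySearch(sortedTokens, keyToken);
        if (i < 0) {
            i = -(i + 1);                 // insertion point: first token greater than keyToken
            if (i == sortedTokens.size()) {
                i = 0;                    // past the last token, wrap to the start of the ring
            }
        }
        return i;
    }

    public static void main(String[] args) {
        List<Long> ring = Arrays.asList(10L, 20L, 30L);
        System.out.println(primaryIndex(ring, 25L));
        System.out.println(primaryIndex(Collections.<Long>emptyList(), 25L));
    }
}
```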

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
