Jim Cistaro
Subject: Re: replace dead node? " token -1 "
Wed, 15 Aug 2012 17:31:25 GMT
I have not viewed the code, but it would seem that replace_token does not "remove token", because
that would spread the data and then "unspread" it when the new node joins.  But like I said,
I have not read the code.

From our standpoint, we want the tokens to stay the same when possible due to the way
our backups are tagged.

As for "old nodes staying around", you are correct: we never remove the token (because we replace
the node at that same token), and gossip keeps knowledge of that old node.

Sorry if this explanation is not that clear.  This issue is a little unclear and we are dealing
with it from an ops POV rather than a dev understanding of the code.

As for the attractiveness of the T-1 approach: if you don't need token consistency,
it might be more attractive for you.  We don't use it, so I cannot say whether that approach
has any issues.


Yang
Wed, 15 Aug 2012 02:00:55 -0700
Subject: Re: replace dead node? " token -1 "

Considering there is this minor "old node hanging around" issue, would the old T-1 approach
sound more attractive?
That way you don't necessarily have to remove the dead token immediately, but could come back
the next day, or even a week later. T-1 would behave essentially the same in terms of partitioning
the data range.
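
For illustration, the T-1 choice can be sketched as below. This is only a sketch under an assumption not stated in the thread: that the cluster uses RandomPartitioner, whose tokens are taken here to be integers in [0, 2**127). The function name and the example token are hypothetical.

```python
# Assumption: RandomPartitioner-style ring with integer tokens in [0, 2**127).
RING_SIZE = 2**127


def token_minus_one(dead_token: int) -> int:
    """Return the token just before dead_token, wrapping around the ring."""
    return (dead_token - 1) % RING_SIZE


# Hypothetical token T of the dead node (2**126, the middle of the ring).
dead_token = 85070591730234615865843651857942052864
new_token = token_minus_one(dead_token)

# The new node at T-1 takes over the range that ends at T-1, so the dead
# node at T is left owning only the single token T until it is removed.
print(new_token)
```

Since a node owns the range from the previous token (exclusive) up to its own token (inclusive), a node at T-1 covers all but one token of what the dead node at T used to own.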


Yang
OK, I see: the cassandra.replace_token setting essentially executes the manual removeToken
step, so the dead node should be removed.

Is this the "old node hanging around" issue that you described?
It looks like this JIRA is fixed in 1.0.x already, so is it another issue?


Yang

Thanks a lot for the info.

When you say "old nodes sometimes hanging around as "unreachable nodes" when describing cluster",
you mean that after the new node boots up and assumes ownership of the same token, you have not
manually run nodetool removeToken, right? This kind of makes sense, since it seems that
the membership being gossiped around still contains the dead node (which is represented by
a different AWS internal IP), though the same token is associated with both the dead and new
nodes? I'm getting a bit confused here...

I think previously, when I booted up a new node with the same token while the old host was dead,
the other nodes on the
ring said something like "this token xxxxxx is already owned by old_node_ip_here,...... ".
I don't remember the exact behavior now, which is why I'm cautious about using T instead of T-1.

I'm doing more tests to confirm this behavior.


Jim Cistaro
We use Priam to replace nodes using replace_token.  We do see some issues (currently on 1.0.9,
as well as earlier versions) with replace_token.

Apparently there are some known issues with replace_token.  We have experienced the old nodes
sometimes hanging around as "unreachable nodes" when describing cluster.  Also, we have experienced
problems where moving the new node causes the old "replaced" node to resurrect for the token
that was outgoing during the move.

You can notice these old nodes hanging around in logs.  You will see messages like:

(line 1020) Nodes /<old_ip> and /<new_ip> have the same token NNNNNNNNNN.  Ignoring /<old_ip>.
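
A minimal sketch of scanning a log for these messages follows. The pattern is modeled only on the message as quoted above; the exact wording in any given Cassandra version, the function name, and the sample addresses are assumptions for illustration.

```python
import re

# Modeled on the quoted message; the real log format may differ by version.
SAME_TOKEN_RE = re.compile(
    r"Nodes /(?P<a>\S+) and /(?P<b>\S+) have the same token (?P<token>\d+)\."
    r"\s+Ignoring /(?P<ignored>\S+?)\.?$"
)


def find_ignored_nodes(lines):
    """Yield (ignored_ip, kept_ip, token) for each matching log line."""
    for line in lines:
        match = SAME_TOKEN_RE.search(line.strip())
        if match is None:
            continue
        ignored = match.group("ignored")
        # The node that is not ignored is the one that keeps the token.
        kept = match.group("b") if ignored == match.group("a") else match.group("a")
        yield ignored, kept, int(match.group("token"))


# Hypothetical log lines for illustration only.
sample = [
    "(line 1020) Nodes /10.0.0.1 and /10.0.0.2 have the same token 12345.  Ignoring /10.0.0.1.",
    "some unrelated log line",
]
for ignored_ip, kept_ip, token in find_ignored_nodes(sample):
    print(ignored_ip, kept_ip, token)
```

The ignored addresses are the candidates to check against the unreachable nodes reported when describing the cluster.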

We have then had to run "nodetool removetoken" to clean things up after moves.  We are also investigating
using the unsafeAssassinateEndpoint method (via JMX) to clean up some of the unreachables.

Like I said, we still use replace_token, but be aware of these possible inconveniences.

Jim Cistaro
Netflix Cassandra Operations

Yang
Tue, 14 Aug 2012 21:58:30 -0700
Subject: Re: replace dead node? " token -1 "

Thanks Aaron, it has been a while since I last checked the code. I'll read it to understand
it more.

aaron morton
> Using this method, when choosing the new <Token>, should we still use the T-1 ?
replace_token is used when you want to replace a node that is dead. In this case the dead
node will be identified by its token.

> if so, would the duplicate token (same token but different ip) cause problems?
If the nodes are bootstrapping an error is raised.
Otherwise the token ownership is passed to the new node.


Aaron Morton
Freelance Developer

Yang

Previously, when a node died, I remember the documentation said it was better to assign
T-1 to the new node,
where T was the token of the dead node.

the new doc for 1.x here

shows a new way to pass in cassandra.replace_token=<Token>
for the new node.
Using this method, when choosing the new <Token>, should we still use the T-1 ?
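
As a rough sketch of what passing that setting looks like, the snippet below formats it as a JVM system property (the usual -D form). The helper name and the example token are hypothetical, and treating tokens as non-negative integers assumes RandomPartitioner. Using T itself rather than T-1 follows Aaron's reply that the dead node is identified by its token.

```python
def replace_token_jvm_opt(token: int) -> str:
    """Format the system property asking a starting node to take over `token`."""
    # Assumption: RandomPartitioner tokens, which are non-negative integers.
    if token < 0:
        raise ValueError("token must be non-negative")
    return f"-Dcassandra.replace_token={token}"


# Hypothetical token T of the dead node, passed unchanged (not T-1).
dead_token = 85070591730234615865843651857942052864
print(replace_token_jvm_opt(dead_token))
```

How the option reaches the JVM (for example through the node's startup options) depends on the deployment and is not shown here.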

Also in Priam code:

At line 148, Priam does not seem to do the "-1" thing; it assigns the original token
T to the new node.
If so, would the duplicate token (same token but different IP) cause problems?


