cassandra-commits mailing list archives

From "Brandon Williams (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-5565) Peer entry drops from system table silently when bootstrapping a node with an existing IP.
Date Wed, 15 May 2013 15:25:16 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658449#comment-13658449 ]

Brandon Williams commented on CASSANDRA-5565:
---------------------------------------------

So, thinking this through a bit more, I think I see what happened.  Let's forget about vnodes
and host ids and pretend you did this in 1.1.  You had a node, you took it down, wiped it,
gave it a different token, and bootstrapped it.  Now your ring is broken: whatever token the
node had before has been overwritten by the new token, because the wiped node comes back with
a newer generation.  That is exactly what happened here.  This is not the correct way to
rebuild a node; it is an operational error, and has been since inception.  It's just how
gossip works.
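
To make the generation point concrete, here is a minimal sketch in Python of a gossip-style
merge keyed by IP address.  It is a toy model, not Cassandra's actual gossip code: the names
EndpointState and apply_gossip are hypothetical, and the IP, generation, token, and host id
values are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class EndpointState:
    """Hypothetical, simplified view of what peers remember about a node."""
    generation: int  # assumed to increase each time the node starts fresh
    token: int
    host_id: str


def apply_gossip(ring: Dict[str, EndpointState], ip: str,
                 incoming: EndpointState) -> Optional[EndpointState]:
    """Merge gossip for `ip`; return the state that was overwritten, if any.

    The ring is keyed by IP, so a newer generation from the same IP
    replaces the old entry wholesale, old token included.
    """
    current = ring.get(ip)
    if current is not None and incoming.generation <= current.generation:
        return None  # stale or duplicate gossip: keep what we have
    ring[ip] = incoming
    return current


# The node originally owned token 100.
ring = {"10.0.0.5": EndpointState(generation=1, token=100, host_id="old-id")}

# It is wiped and bootstrapped with a different token; the fresh start
# gives it a newer generation, so its old entry is silently replaced.
replaced = apply_gossip(
    ring, "10.0.0.5", EndpointState(generation=2, token=200, host_id="new-id"))

print("overwritten:", replaced)
print("ring now:", ring["10.0.0.5"])
```

In this model nothing in the ring remembers token 100 after the merge, which matches the
behaviour described above: the old ownership simply disappears.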
                
> Peer entry drops from system table silently when bootstrapping a node with an existing IP.
> ------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-5565
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5565
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.2.4
>            Reporter: Rick Branson
>            Assignee: Brandon Williams
>
> It looks like CASSANDRA-5167 introduced a bit of a regression. I needed to rebuild the
> data on a malfunctioning node by rebootstrapping it. I did this by cleaning the host and
> restarting Cassandra. My plan was to remove the old hostID once it had successfully
> bootstrapped.
> No errors were encountered, but the old host ID of the node before the wipe was completely
> dropped from the peers table because they had the same IP address, and therefore the data
> ranges were moved around. This resulted in a large number of CL.ONE reads coming back empty.
> There might be a better approach to this rebootstrap process, but it seems like it's
> dangerous to just drop the peer from the table, especially without any kind of log message.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
