cassandra-commits mailing list archives

From "Brandon Williams (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-2371) Removed/Dead Node keeps reappearing
Date Thu, 07 Apr 2011 00:06:05 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016627#comment-13016627 ]

Brandon Williams commented on CASSANDRA-2371:
---------------------------------------------

There is a second problem here.  We set the gossiper's application state to LEFT only for
decommission, not for removetoken.  The problem with adding it for removetoken, however, is
that SS.handleStateLeft gets called every gossip round and runs through the hint removal
process each time.  One option may be to check whether the node is locally persisted and,
if not, ignore the message, since we never knew about it anyway.  Another is to simply not
remove hints when we see the LEFT state: they'll expire anyway and large unaccessed rows
aren't a problem, so the removal seems like a throwback from the <=0.6 days.
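
A minimal sketch of the first option, for illustration only.  This is not Cassandra's
StorageService code: the class LeftStateSketch, its fields, and its method signatures are
all hypothetical, and an in-memory set stands in for whatever the node has persisted
locally.  It only shows how checking local persistence would keep a LEFT state that is
re-delivered every gossip round from re-running hint removal.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the LEFT-state handling discussed above. */
class LeftStateSketch {
    // Endpoints this node has persisted locally (assumption: a simple in-memory set).
    private final Set<String> persistedEndpoints;
    // Number of times the hint removal step has run.
    private int hintRemovals = 0;

    LeftStateSketch(Collection<String> persisted) {
        this.persistedEndpoints = new HashSet<>(persisted);
    }

    /**
     * Called once per gossip round while the endpoint advertises LEFT.
     * Returns true if the state was acted on, false if it was ignored.
     */
    boolean handleStateLeft(String endpoint) {
        // Set.remove returns false when the endpoint was not present, i.e. we
        // never knew about it or we already processed its removal.
        if (!persistedEndpoints.remove(endpoint)) {
            return false;
        }
        hintRemovals++; // placeholder for the real hint removal work
        return true;
    }

    int hintRemovals() {
        return hintRemovals;
    }
}
```

With this guard, the first LEFT message for a known endpoint does the cleanup and every
later round is a cheap set lookup.  The second option (not removing hints at all) would
amount to dropping the hint removal step entirely.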

> Removed/Dead Node keeps reappearing
> -----------------------------------
>
>                 Key: CASSANDRA-2371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>    Affects Versions: 0.7.3, 0.7.4
>         Environment: Large Amazon EC2 instances. Ubuntu 10.04.2 
>            Reporter: techlabs
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 0.7.5
>
>         Attachments: 2371.txt
>
>
> The removetoken option does not seem to work. The original node 10.240.50.63 comes back
> into the ring even after the EC2 instance no longer exists. Originally I tried to add a
> new node, 10.214.103.224, with the same token, but there were some complications with
> that. I have pasted below all the INFO log entries found by grepping the system log files.
> This seems similar to the issue described at http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Ghost-node-showing-up-in-the-ring-td6198180.html
> INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  INFO [GossipStage:1] 2011-03-16 17:26:51,083 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:27:24,767 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:29:30,191 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:31:35,609 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:33:39,440 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [GossipStage:1] 2011-03-10 03:52:37,299 Gossiper.java (line 608) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-10 03:52:37,545 Gossiper.java (line 600) InetAddress /10.240.50.63 is now UP
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,168 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,169 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [GossipStage:1] 2011-03-15 23:23:43,770 Gossiper.java (line 623) Node /10.240.50.63 has restarted, now UP again
>  INFO [GossipStage:1] 2011-03-15 23:23:43,771 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,957 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,958 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-15 23:37:25,071 Gossiper.java (line 226) InetAddress /10.240.50.63 is now dead.
>  INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224 and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224 is the new owner
>  WARN [GossipStage:1] 2011-03-16 00:54:31,590 TokenMetadata.java (line 115) Token 95704415696513900000000000000000000000 changing ownership from /10.240.50.63 to /10.214.103.224
>  INFO [GossipStage:1] 2011-03-18 23:37:09,158 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 Gossiper.java (line 610) Node /10.240.50.63 is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 StorageService.java (line 726) Node /10.240.50.63 state jump to normal
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-23 17:22:55,521 HintedHandOffManager.java (line 210) Deleting any stored hints for 10.240.50.63

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
