cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-2371) Removed/Dead Node keeps reappearing
Date Mon, 11 Apr 2011 16:12:05 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018412#comment-13018412
] 

Jonathan Ellis commented on CASSANDRA-2371:
-------------------------------------------

bq. The problem with adding it, however, is that SS.handleStateLeft gets called every gossip
round and runs through the hint removal process

But it only runs hint removal once per node, right?  Or is "onChange" not an accurate method
name?

bq. One option may be to check if the node is locally persisted and if not, just ignore the
message since we never knew about it anyway.

Locally persisted... with a token in SystemTable?  Ignore the ... hint message?

bq. Another is to just not remove hints when we see the LEFT state

We should issue a delete to the hints row, but we should not force a major compaction. The
rule of thumb is, avoiding inflicting a performance hit on the cluster trumps immediate disk
space cleanup.

> Removed/Dead Node keeps reappearing
> -----------------------------------
>
>                 Key: CASSANDRA-2371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>    Affects Versions: 0.7.3, 0.7.4
>         Environment: Large Amazon EC2 instances. Ubuntu 10.04.2 
>            Reporter: techlabs
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 0.7.5
>
>         Attachments: 2371.txt
>
>
> The removetoken option does not seem to work. The original node 10.240.50.63 comes back
into the ring, even after the EC2 instance is no longer in existence. Originally I tried to
add a new node 10.214.103.224 with the same token, but there were some complications with
that. I have pasted below all the INFO log entries found with greping the system log files.
> Seems to be a similar issue seen with http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Ghost-node-showing-up-in-the-ring-td6198180.html

> INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224
and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224
is the new owner
>  INFO [GossipStage:1] 2011-03-16 17:26:51,083 StorageService.java (line 865) Removing
token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:27:24,767 StorageService.java (line 865) Removing
token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:29:30,191 StorageService.java (line 865) Removing
token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:31:35,609 StorageService.java (line 865) Removing
token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-19 17:33:39,440 StorageService.java (line 865) Removing
token 95704415696513900000000000000000000000 for /10.214.103.224
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing
token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [GossipStage:1] 2011-03-10 03:52:37,299 Gossiper.java (line 608) Node /10.240.50.63
is now part of the cluster
>  INFO [GossipStage:1] 2011-03-10 03:52:37,545 Gossiper.java (line 600) InetAddress /10.240.50.63
is now UP
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,168 HintedHandOffManager.java (line 304)
Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-10 03:53:36,169 HintedHandOffManager.java (line 360)
Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [GossipStage:1] 2011-03-15 23:23:43,770 Gossiper.java (line 623) Node /10.240.50.63
has restarted, now UP again
>  INFO [GossipStage:1] 2011-03-15 23:23:43,771 StorageService.java (line 726) Node /10.240.50.63
state jump to normal
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,957 HintedHandOffManager.java (line 304)
Started hinted handoff for endpoint /10.240.50.63
>  INFO [HintedHandoff:1] 2011-03-15 23:28:48,958 HintedHandOffManager.java (line 360)
Finished hinted handoff of 0 rows to endpoint /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-15 23:37:25,071 Gossiper.java (line 226) InetAddress
/10.240.50.63 is now dead.
>  INFO [GossipStage:1] 2011-03-16 00:54:31,590 StorageService.java (line 745) Nodes /10.214.103.224
and /10.240.50.63 have the same token 95704415696513900000000000000000000000.  /10.214.103.224
is the new owner
>  WARN [GossipStage:1] 2011-03-16 00:54:31,590 TokenMetadata.java (line 115) Token 95704415696513900000000000000000000000
changing ownership from /10.240.50.63 to /10.214.103.224
>  INFO [GossipStage:1] 2011-03-18 23:37:09,158 Gossiper.java (line 610) Node /10.240.50.63
is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 Gossiper.java (line 610) Node /10.240.50.63
is now part of the cluster
>  INFO [GossipStage:1] 2011-03-21 23:37:10,421 StorageService.java (line 726) Node /10.240.50.63
state jump to normal
>  INFO [GossipStage:1] 2011-03-23 17:22:55,520 StorageService.java (line 865) Removing
token 95704415696513900000000000000000000000 for /10.240.50.63
>  INFO [ScheduledTasks:1] 2011-03-23 17:22:55,521 HintedHandOffManager.java (line 210)
Deleting any stored hints for 10.240.50.63

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message