cassandra-commits mailing list archives

From "Peter Haggerty (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-5780) nodetool status and ring report incorrect/stale information after decommission
Date Sat, 14 Dec 2013 13:51:07 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13848357#comment-13848357 ]

Peter Haggerty commented on CASSANDRA-5780:
-------------------------------------------

We just ran into this again when a node rebooted and came back up thinking everything was
fine while every other node in the ring disagreed. We resolved it with our normal "manual
restart" procedure: stop thrift, stop gossip, flush the node, drain the node, then restart
cassandra. It definitely caused some confusion, though, because "nodetool status" and
"nodetool info" reported that the node was up and a working part of the cluster when in
fact it wasn't.

The nodes in this state definitely do *not* make it clear that they are not part of the cluster
anymore.
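
For reference, the "manual restart" steps named above map onto nodetool subcommands
roughly as in the Python sketch below. This is an illustration, not our actual script:
the function names and the dry-run default are hypothetical, and the final cassandra
restart is left out because the service command depends on how cassandra was installed.

```python
import subprocess

# nodetool subcommands matching the steps named in the comment above.
# The order follows the comment; the restart itself is left to the caller
# because the service command depends on the installation.
STEPS = ["disablethrift", "disablegossip", "flush", "drain"]


def manual_restart_commands(host="localhost"):
    """Return the nodetool invocations as argv lists, in order."""
    return [["nodetool", "-h", host, step] for step in STEPS]


def run_manual_restart(host="localhost", dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in manual_restart_commands(host):
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # stop at the first failure
```

Calling run_manual_restart() with the defaults only prints the four commands, which
makes it easy to check the order before running anything against a real node.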

> nodetool status and ring report incorrect/stale information after decommission
> ------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-5780
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5780
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>            Reporter: Peter Haggerty
>            Priority: Trivial
>              Labels: lhf, ponies
>
> Cassandra 1.2.6 ring of 12 instances, each with 256 tokens.
> Decommission 3 of the 12 nodes, one after another, resulting in a 9-instance ring.
> The 9 instances of cassandra that are in the ring all correctly report nodetool status
> information for the ring and have the same data.
> After the first node is decommissioned:
> "nodetool status" on "decommissioned-1st" reports 11 nodes
> After the second node is decommissioned:
> "nodetool status" on "decommissioned-1st" reports 11 nodes
> "nodetool status" on "decommissioned-2nd" reports 10 nodes
> After the third node is decommissioned:
> "nodetool status" on "decommissioned-1st" reports 11 nodes
> "nodetool status" on "decommissioned-2nd" reports 10 nodes
> "nodetool status" on "decommissioned-3rd" reports 9 nodes
> The storage load information is similarly stale on the various decommissioned nodes.
> The nodetool status and ring commands continue to return information as if the nodes
> were still part of a cluster, and they appear to return the last information they saw.
> In contrast, the nodetool info command fails with an exception, which isn't ideal but
> at least indicates that there was a failure rather than returning stale information.
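
The node counts quoted above are consistent with each decommissioned node freezing its
view of the ring at the moment it left. The Python sketch below models only that
assumption; stale_views is a hypothetical helper for illustration and does not reflect
how cassandra tracks ring state internally.

```python
def stale_views(initial_size, decommission_order):
    """Map each decommissioned node to the ring size it last saw."""
    ring_size = initial_size
    last_seen = {}
    for name in decommission_order:
        ring_size -= 1               # the node leaves the ring
        last_seen[name] = ring_size  # and its view is never updated again
    return ring_size, last_seen


live_size, views = stale_views(
    12, ["decommissioned-1st", "decommissioned-2nd", "decommissioned-3rd"]
)
print(live_size)  # 9
print(views)      # {'decommissioned-1st': 11, 'decommissioned-2nd': 10, 'decommissioned-3rd': 9}
```

Only the last node to leave happens to agree with the live ring; the earlier ones keep
reporting 11 and 10 nodes.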



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
