cassandra-commits mailing list archives

From "John Sumsion (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-5780) nodetool status and ring report incorrect/stale information after decommission
Date Mon, 28 Sep 2015 18:36:05 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933745#comment-14933745 ]

John Sumsion commented on CASSANDRA-5780:
-----------------------------------------

The only thing I wouldn't want to happen is to accidentally issue some kind of truncate that, through a race condition, inadvertently gets replicated to the entire cluster. I don't know the Cassandra codebase well enough to understand whether that risk exists when calling {{ColumnFamilyStore.truncateBlocking()}}. From what I can tell, it's likely pretty safe, because once you get down to StorageService there is no cross-cluster effect of actions taken at that level.

Can anyone who knows the code better reply about what cross-cluster effects {{truncateBlocking()}} might have?

The reason I don't have that concern with the 'system' keyspace is that it is never replicated.
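
For what it's worth, the 'system' keyspace is defined with LocalStrategy, whose replica set is only the local node, so nothing written or truncated there can propagate to other nodes. A rough sketch of that check (the {{Keyspace}} and {{LocalStrategy}} classes are real; the wrapper method is just illustrative):

{code:java}
import org.apache.cassandra.db.Keyspace;
import org.apache.cassandra.locator.LocalStrategy;

public class SystemKeyspaceReplicationCheck
{
    // LocalStrategy never replicates data off the local node, so anything done
    // to 'system' tables stays on this node and cannot affect the cluster.
    public static boolean isLocalOnly()
    {
        return Keyspace.open("system").getReplicationStrategy() instanceof LocalStrategy;
    }
}
{code}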

Actually, looking into {{ColumnFamilyStore.truncateBlocking()}} makes me think that my proposed changes will blow up half-way through, because a side effect of truncating a table is writing a "truncated at" record back to the 'system.local' table (which we just truncated). I guess I need to run ccm with a locally built Cassandra and try decommissioning to see what happens (not sure how to do that).
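
To make the ordering concern concrete, here is a rough sketch (not an actual patch; the skip-system.local handling is purely hypothetical) of why truncating everything in 'system' via {{truncateBlocking()}} would be self-defeating:

{code:java}
import org.apache.cassandra.db.ColumnFamilyStore;
import org.apache.cassandra.db.Keyspace;

public class DecommissionCleanupSketch
{
    public static void truncateLocalSystemTables()
    {
        for (ColumnFamilyStore cfs : Keyspace.open("system").getColumnFamilyStores())
        {
            // truncateBlocking() finishes by persisting a "truncated at" record
            // into system.local, so calling it on system.local itself would write
            // back into the very table that was just cleared.
            if (cfs.name.equals("local"))
                continue; // hypothetical: system.local would need special handling

            cfs.truncateBlocking();
        }
    }
}
{code}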

> nodetool status and ring report incorrect/stale information after decommission
> ------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-5780
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5780
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>            Reporter: Peter Haggerty
>            Priority: Trivial
>              Labels: lhf, ponies, qa-resolved
>             Fix For: 2.1.x
>
>
> Cassandra 1.2.6 ring of 12 instances, each with 256 tokens.
> Decommission 3 of the 12 nodes, one after another, resulting in a 9-instance ring.
> The 9 instances of Cassandra that are in the ring all correctly report nodetool status information for the ring and have the same data.
> After the first node is decommissioned:
> "nodetool status" on "decommissioned-1st" reports 11 nodes
> After the second node is decommissioned:
> "nodetool status" on "decommissioned-1st" reports 11 nodes
> "nodetool status" on "decommissioned-2nd" reports 10 nodes
> After the third node is decommissioned:
> "nodetool status" on "decommissioned-1st" reports 11 nodes
> "nodetool status" on "decommissioned-2nd" reports 10 nodes
> "nodetool status" on "decommissioned-3rd" reports 9 nodes
> The storage load information is similarly stale on the various decommissioned nodes. The nodetool status and ring commands continue to return information as if they were still part of a cluster, and they appear to return the last information they saw.
> In contrast, the nodetool info command fails with an exception, which isn't ideal but at least indicates that there was a failure rather than returning stale information.



