cassandra-commits mailing list archives

From "Joel Knighton (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-11740) Nodes have wrong membership view of the cluster
Date Tue, 05 Jul 2016 14:42:11 GMT


Joel Knighton commented on CASSANDRA-11740:

I don't have any great ideas here other than Jeremiah's suggestion above. When using GPFS
(GossipingPropertyFileSnitch), DC/rack lookups follow a hierarchy.

First, we look for the information in gossip.

Then, if we have a fallback PropertyFileSnitch, we will use that.
If we don't, we'll first look in the system keyspace and then return defaults. The default
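
The lookup order above can be sketched roughly as follows. This is only an illustration of the described order, not Cassandra's actual code: the function name, the dictionary-based stand-ins for gossip and the system keyspace, and the placeholder default values are all assumptions made for the example.

```python
from typing import Callable, Mapping, Optional, Tuple

Location = Tuple[str, str]  # (datacenter, rack)

# Placeholder only: the real default values are not stated in this message.
ASSUMED_DEFAULT: Location = ("<default-dc>", "<default-rack>")


def resolve_location(
    endpoint: str,
    gossip: Mapping[str, Location],
    fallback_snitch: Optional[Callable[[str], Location]],
    system_keyspace: Mapping[str, Location],
    default: Location = ASSUMED_DEFAULT,
) -> Location:
    """Hypothetical model of the lookup order described in the comment."""
    # 1. Information found in gossip is used first.
    if endpoint in gossip:
        return gossip[endpoint]
    # 2. If a fallback PropertyFileSnitch exists, it answers instead.
    if fallback_snitch is not None:
        return fallback_snitch(endpoint)
    # 3. Otherwise check the system keyspace, then return the defaults.
    if endpoint in system_keyspace:
        return system_keyspace[endpoint]
    return default


# Example: node B is missing from gossip and a leftover topology file
# (modelled here as a function) reports DC1/r1 for every unknown endpoint.
leftover_file = lambda endpoint: ("DC1", "r1")
print(resolve_location("node-b", {}, leftover_file, {"node-b": ("A", "r7")}))
```

In this model, a leftover fallback file shadows whatever the system keyspace holds, which matches the symptom of nodes showing up in DC1/r1 when their gossip information is missing.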

I have no idea how these values could get into gossip or the system keyspace of the node without
having this configured in a file.

Since DC1/r1 are the default options given in the sample distributed
with Cassandra, it seems likely that this config file has not been removed from all nodes.

That said, if the information isn't present in gossip, something else is likely wrong as well.
This could be debugged more effectively with debug/trace level logs from some node A that shows
bad nodetool status output for node B, together with the debug/trace level logs from node B.
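
To pick candidates for node A and node B, one rough option is to scan each node's nodetool status output for datacenters outside the expected set. The sketch below assumes the text layout shown in the issue description quoted below (a "Datacenter: <name>" header followed by rows that start with a two-letter status/state code); the function names are hypothetical.

```python
import re
from typing import Dict, Iterable, List

# Rows start with status (U/D) and state (N/L/J/M), then the address.
NODE_ROW = re.compile(r"^[UD][NLJM]\s+\S+")


def nodes_by_datacenter(status_output: str) -> Dict[str, List[str]]:
    """Group node addresses by the datacenter header they appear under."""
    result: Dict[str, List[str]] = {}
    current = None
    for raw in status_output.splitlines():
        line = raw.strip()
        if line.startswith("Datacenter:"):
            current = line.split(":", 1)[1].strip()
            result.setdefault(current, [])
        elif current is not None and NODE_ROW.match(line):
            result[current].append(line.split()[1])
    return result


def unexpected_datacenters(
    status_output: str, expected: Iterable[str]
) -> Dict[str, List[str]]:
    """Return only the datacenters (and their nodes) not in `expected`."""
    allowed = set(expected)
    grouped = nodes_by_datacenter(status_output)
    return {dc: nodes for dc, nodes in grouped.items() if dc not in allowed}


sample = """Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address Load Tokens Owns Host ID Rack
UN 2401:db00:11:6134:face:0:1:0 509.52 GB 256 ? e24656ac-c3b2-4117-b933-a5b06852c993
"""
print(unexpected_datacenters(sample, expected={"A", "B", "C"}))
```

Any node whose output yields a non-empty result is a candidate for node A, and the addresses listed under the unexpected datacenter are candidates for node B.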

> Nodes have wrong membership view of the cluster
> -----------------------------------------------
>                 Key: CASSANDRA-11740
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Dikang Gu
>            Assignee: Joel Knighton
>             Fix For: 2.2.x, 3.x
> We have a few hundred nodes across 3 data centers, and we are doing a few million writes
> per second into the cluster.
> The problem we found is that some nodes (>10) have a very wrong view of the cluster.
> For example, we have 3 data centers A, B and C. On the problem nodes, the output of
> 'nodetool status' shows that ~100 nodes are not in data center A, B, or C. Instead,
> it shows the nodes in DC1 and rack r1, which is very wrong. As a result, the node will
> return wrong results to client requests.
> {code}
> Datacenter: DC1
> ===============
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> -- Address Load Tokens Owns Host ID Rack
> UN 2401:db00:11:6134:face:0:1:0 509.52 GB 256 ? e24656ac-c3b2-4117-b933-a5b06852c993
> UN 2401:db00:11:b218:face:0:5:0 510.01 GB 256 ? 53da2104-b1b5-4fa5-a3dd-52c7557149f9
> UN 2401:db00:2130:5133:face:0:4d:0 459.75 GB 256 ? ef8311f0-f6b8-491c-904d-baa925cdd7c2
> {code}
> We are using GossipingPropertyFileSnitch.
> Thanks

This message was sent by Atlassian JIRA
