cassandra-commits mailing list archives

From "Joel Knighton (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-10111) reconnecting snitch can bypass cluster name check
Date Mon, 21 Dec 2015 21:18:46 GMT


Joel Knighton commented on CASSANDRA-10111:

Sounds good - my original understanding was that this would be okay, but it sounds like the
messaging service version change strategy is still unclear.

I think the best option is to wait until the next messaging service change. As you mentioned,
this is an unlikely situation that has a solution in the form of forcing removal of the entries
from gossip using nodetool.

> reconnecting snitch can bypass cluster name check
> -------------------------------------------------
>                 Key: CASSANDRA-10111
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Distributed Metadata
>            Reporter: Chris Burroughs
>            Assignee: Joel Knighton
>              Labels: gossip, messaging-service-bump-required
>             Fix For: 3.x
> Setup:
>  * Two clusters: A & B
>  * Both are two-DC clusters
>  * Both use GossipingPropertyFileSnitch with different listen_address/broadcast_address
> A new node was added to cluster A with a broadcast_address of an existing node in cluster
B (due to an out-of-date DNS entry).  Cluster B added all of the nodes from cluster A, somehow
bypassing the cluster name mismatch check for these nodes.  The first reference to cluster
A nodes in cluster B logs is when they were added:
> {noformat}
>  INFO [GossipStage:1] 2015-08-17 15:08:33,858 (line 983) Node /
is now part of the cluster
> {noformat}
> Cluster B nodes then tried to gossip to cluster A nodes, but cluster A kept them out
with 'ClusterName mismatch'.  Cluster B, however, tried to send reads/writes to cluster
A and general mayhem ensued.
> Obviously this is a Bad (TM) config that Should Not Be Done.  However, since the consequences
of crazy merged clusters are really bad (the reason there is the name mismatch check in the
first place) I think the hole is reasonable to plug.  I'm not sure exactly what the code path
is that skips the check in GossipDigestSynVerbHandler.
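
For readers unfamiliar with the name mismatch check mentioned in the description: the idea is that a node drops an incoming gossip SYN whose sender reports a different cluster name. Below is a minimal sketch of that idea only. The names ClusterNameCheck, Syn, and accept are hypothetical and are not Cassandra's actual classes or the code in GossipDigestSynVerbHandler.

```java
// Hypothetical illustration of a cluster name check on an incoming gossip SYN.
// None of these names come from the Cassandra code base.
class ClusterNameCheck {

    /** Minimal stand-in for a gossip SYN: carries only the sender's cluster name. */
    static final class Syn {
        final String clusterName;

        Syn(String clusterName) {
            this.clusterName = clusterName;
        }
    }

    /**
     * Returns true only when the sender's cluster name equals the local one.
     * A missing message or a missing name on either side is treated as a mismatch.
     */
    static boolean accept(String localClusterName, Syn syn) {
        return localClusterName != null
                && syn != null
                && localClusterName.equals(syn.clusterName);
    }

    public static void main(String[] args) {
        // A node in cluster B receiving a SYN from a node in cluster B, then from cluster A.
        System.out.println(accept("B", new Syn("B")));
        System.out.println(accept("B", new Syn("A")));
    }
}
```

The issue is that some path lets cluster A's nodes become known to cluster B without passing through a check of this kind; the sketch does not attempt to show that path, since the description itself leaves it unidentified.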

This message was sent by Atlassian JIRA
