cassandra-commits mailing list archives

From "Jeff Jirsa (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-7306) Support "edge dcs" with more flexible gossip
Date Fri, 11 Dec 2015 07:49:11 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14984783#comment-14984783 ]

Jeff Jirsa edited comment on CASSANDRA-7306 at 12/11/15 7:49 AM:
-----------------------------------------------------------------

I've implemented some of this, primarily for my own education (learning some of the internals
of gossip better). I've approached this by creating a pluggable IDatacenterTopologyProvider,
and implemented full-mesh, file-based whitelist, and file-based blacklist providers. I then
extended Gossiper to filter its list of live endpoints by calling the IDatacenterTopologyProvider
instance to exclude non-gossipable endpoints, which seems to fit the goal of this ticket. This
enables not only hub/spoke, but arbitrary graphs of database connectivity.
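
A minimal sketch of the provider-plus-filter design described above, for illustration only. The interface name mirrors IDatacenterTopologyProvider, but the `canGossip` method, its signature, the map-backed whitelist/blacklist classes, and `GossipFilter` are all hypothetical; the actual patch reads its lists from files and hooks into Gossiper, neither of which is shown here.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the IDatacenterTopologyProvider named above.
// The method name and signature are assumptions, not taken from the patch.
interface DatacenterTopologyProvider {
    boolean canGossip(String localDc, String remoteDc);
}

// Full mesh: every DC may gossip with every other DC.
final class FullMeshProvider implements DatacenterTopologyProvider {
    public boolean canGossip(String localDc, String remoteDc) {
        return true;
    }
}

// Whitelist: a DC gossips with itself and with the DCs listed for it.
final class WhitelistProvider implements DatacenterTopologyProvider {
    private final Map<String, Set<String>> allowed;

    WhitelistProvider(Map<String, Set<String>> allowed) {
        this.allowed = allowed;
    }

    public boolean canGossip(String localDc, String remoteDc) {
        return localDc.equals(remoteDc)
            || allowed.getOrDefault(localDc, Set.of()).contains(remoteDc);
    }
}

// Blacklist: a DC gossips with everything except the DCs listed for it.
final class BlacklistProvider implements DatacenterTopologyProvider {
    private final Map<String, Set<String>> denied;

    BlacklistProvider(Map<String, Set<String>> denied) {
        this.denied = denied;
    }

    public boolean canGossip(String localDc, String remoteDc) {
        return !denied.getOrDefault(localDc, Set.of()).contains(remoteDc);
    }
}

final class GossipFilter {
    // liveEndpointDcs maps an endpoint address to the DC it belongs to.
    // Returns only the endpoints the local DC is allowed to gossip with.
    static Set<String> gossipable(String localDc,
                                  Map<String, String> liveEndpointDcs,
                                  DatacenterTopologyProvider provider) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, String> e : liveEndpointDcs.entrySet()) {
            if (provider.canGossip(localDc, e.getValue())) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}

final class TopologyDemo {
    public static void main(String[] args) {
        // Hub/spoke: edges talk only to the hub, the hub talks to both edges.
        DatacenterTopologyProvider hubSpoke = new WhitelistProvider(Map.of(
            "hub", Set.of("edge1", "edge2"),
            "edge1", Set.of("hub"),
            "edge2", Set.of("hub")));
        Map<String, String> live = Map.of(
            "10.0.0.1", "hub", "10.0.1.1", "edge1", "10.0.2.1", "edge2");
        System.out.println(GossipFilter.gossipable("edge1", live, hubSpoke));
    }
}
```

With the hub/spoke whitelist in `TopologyDemo`, a node in edge1 keeps the hub endpoint and its own, and drops the edge2 endpoint. An arbitrary connectivity graph is just a different map.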

However, the ticket is pretty poorly defined in terms of behaviors. 

[~tupshin], this ticket title mentions "more flexible gossip" - does this carry into requests/CL
as well? What's the desired/expected behavior if a KS uses NTS to have rf=3 in dcs a, b, and
c, but hosts in dc=b are set not to gossip with hosts in dc=c, and vice versa? CL=ALL fails,
CL=QUORUM fills with a+b, and writes just assume all nodes in c are down? Or should it be
smart enough to know that c is disconnected, and not count hosts in c towards quorum/ALL?
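
To make the two options in that question concrete, here is a small arithmetic sketch for a coordinator in dc=b. It assumes the usual quorum rule (replicas counted, divided by two, plus one); the class and method names are hypothetical and do not correspond to Cassandra's own classes.

```java
import java.util.Map;
import java.util.Set;

final class ConsistencySketch {
    // Quorum is a majority of the replicas being counted.
    static int quorum(int replicas) {
        return replicas / 2 + 1;
    }

    // Sum the per-DC replication factors for the DCs in countedDcs.
    static int replicasIn(Map<String, Integer> rfByDc, Set<String> countedDcs) {
        int total = 0;
        for (Map.Entry<String, Integer> e : rfByDc.entrySet()) {
            if (countedDcs.contains(e.getKey())) {
                total += e.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> rf = Map.of("a", 3, "b", 3, "c", 3);
        Set<String> visibleFromB = Set.of("a", "b"); // b does not gossip with c
        int reachable = replicasIn(rf, visibleFromB);

        // Option 1: c still counts and its nodes simply look down.
        int countedAll = replicasIn(rf, rf.keySet());
        System.out.println("option 1: QUORUM needs " + quorum(countedAll)
            + ", ALL needs " + countedAll + ", reachable " + reachable);

        // Option 2: c is known to be disconnected and is not counted.
        int countedVisible = replicasIn(rf, visibleFromB);
        System.out.println("option 2: QUORUM needs " + quorum(countedVisible)
            + ", ALL needs " + countedVisible + ", reachable " + reachable);
    }
}
```

Under option 1 the coordinator in b can reach 6 of 9 replicas, so QUORUM (5) can succeed while ALL (9) cannot. Under option 2 only 6 replicas are counted, so QUORUM needs 4 and ALL needs 6, both of which are reachable.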


-My primary hangup is finding the right way to notify the KS replication strategy to reload
if the list of whitelisted/blacklisted DCs changes. I know it's a solvable problem, but
if it's out of scope, I won't waste time with it. I realize that this is a {{ponies}} ticket,
and there's a ton of bike-shed/ponies opportunity here, but if we can get some consensus on
definition, I can try to get this to a point where it can potentially be ready for real review-


was (Author: jjirsa):
I've implemented some of this, primarily for my own education (learning some of the internals
of gossip better). I've approached this by creating a pluggable IDatacenterTopologyProvider,
and implemented full-mesh, file-based whitelist, and file-based blacklist providers. I then
extended Gossiper to filter its list of live endpoints by calling the IDatacenterTopologyProvider
instance to exclude non-gossipable endpoints, which seems to fit the goal of this ticket. This
enables not only hub/spoke, but arbitrary graphs of database connectivity.

However, the ticket is pretty poorly defined in terms of behaviors. 

[~tupshin], this ticket title mentions "more flexible gossip" - does this carry into requests/CL
as well? What's the desired/expected behavior if a KS uses NTS to have rf=3 in dcs a, b, and
c, but hosts in dc=b are set not to gossip with hosts in dc=c, and vice versa? CL=ALL fails,
CL=QUORUM fills with a+b, and writes just assume all nodes in c are down? Or should it be
smart enough to know that c is disconnected, and not count hosts in c towards quorum/ALL?


My primary hangup is finding the right way to notify the KS replication strategy to reload
if the list of whitelisted/blacklisted DCs changes. I know it's a solvable problem, but
if it's out of scope, I won't waste time with it. I realize that this is a {{ponies}} ticket,
and there's a ton of bike-shed/ponies opportunity here, but if we can get some consensus on
definition, I can try to get this to a point where it can potentially be ready for real review.


> Support "edge dcs" with more flexible gossip
> --------------------------------------------
>
>                 Key: CASSANDRA-7306
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7306
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Tupshin Harper
>              Labels: ponies
>
> As Cassandra clusters get bigger and bigger, and their topology becomes more complex,
> there is more and more need for a notion of "hub" and "spoke" datacenters.
> One of the big obstacles to supporting hundreds (or thousands) of remote dcs, is the
> assumption that all dcs need to talk to each other (and be connected all the time).
> This ticket is a vague placeholder with the goals of achieving:
> 1) better behavioral support for occasionally disconnected datacenters
> 2) explicit support for custom dc to dc routing. A simple approach would be an optional
> per-dc annotation of which other DCs that DC could gossip with.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
