I've done some gossip simulations in the past and found virtually no difference in the time it takes for messages to propagate in almost any sized cluster. IIRC it always converges by 17 iterations. Thus, I completely agree with Jeff's comment here. If you aren't pushing 800-1000 nodes, it's not even worth bothering with. Just be sure you have seeds in each DC.
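
For anyone who wants to try this kind of experiment, here is a minimal sketch of a toy gossip simulation. It is not Cassandra's Gossiper: it assumes a simple push-pull model in which every node exchanges state with one random peer per round, and the function name `rounds_to_converge` is made up for illustration.

```python
import random


def rounds_to_converge(n, rng):
    """Count gossip rounds until all n nodes know a piece of state.

    Toy push-pull model (an assumption, not Cassandra's real protocol):
    each round, every node picks one random other node and the pair
    ends up sharing whatever either of them knew at the start of the round.
    """
    informed = [False] * n
    informed[0] = True  # node 0 originates the update
    rounds = 0
    while not all(informed):
        snapshot = informed[:]  # state as of the start of this round
        for node in range(n):
            # Pick a uniformly random peer other than the node itself.
            peer = rng.randrange(n - 1)
            if peer >= node:
                peer += 1
            if snapshot[node] or snapshot[peer]:
                informed[node] = True
                informed[peer] = True
        rounds += 1
    return rounds


if __name__ == "__main__":
    rng = random.Random(42)
    for size in (10, 100, 1000):
        print(size, rounds_to_converge(size, rng))
```

In this toy model the round count should grow only slowly as the cluster size goes up, which is the general behaviour described above; the exact numbers depend on the model and the random seed.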

Something to be aware of - there's only a chance to gossip with a seed. That chance goes down as cluster size increases, meaning seeds have less and less of an impact as the cluster grows. Once you get to 100+ nodes, a given node is very rarely talking to a seed.
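
To put rough numbers on that, here is a small sketch. It assumes the per-round chance of an extra exchange with a seed is about (number of seeds) / (number of nodes in the cluster), capped at 1. That formula is my simplified reading of the behaviour, and `seed_gossip_chance` is a hypothetical helper, not a Cassandra API.

```python
def seed_gossip_chance(num_seeds: int, cluster_size: int) -> float:
    """Approximate per-round chance that a node also gossips with a seed.

    Assumed model (not taken from Cassandra's source here): the chance
    is seeds / cluster size, capped at 1.0.
    """
    if num_seeds < 0 or cluster_size <= 0:
        raise ValueError("need num_seeds >= 0 and cluster_size > 0")
    return min(1.0, num_seeds / cluster_size)
```

Under that assumption, 3 seeds in a 10-node cluster give a 30% chance per round, while the same 3 seeds in a 100-node cluster give 3%, and in a 1000-node cluster 0.3%.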

Just make sure when you start a node it's not in its own seed list and you're good.

On Tue, Jan 8, 2019 at 9:39 AM Jeff Jirsa <jjirsa@gmail.com> wrote:

On Tue, Jan 8, 2019 at 8:19 AM Jonathan Ballet <jballet@edgelab.ch> wrote:
Hi Jeff,

thanks for answering most of my points!
From the reloadseeds ticket, I followed the link to https://issues.apache.org/jira/browse/CASSANDRA-3829, which was very instructive, although a bit old.

On Mon, 7 Jan 2019 at 17:23, Jeff Jirsa <jjirsa@gmail.com> wrote:
> On Jan 7, 2019, at 6:37 AM, Jonathan Ballet <jballet@edgelab.ch> wrote:
>
[...]

>   In essence, in my example that would be:
>
>   - decide that #2 and #3 will be the new seed nodes
>   - update all the configuration files of all the nodes to write the IP addresses of #2 and #3
>   - DON'T restart any node - the new seed configuration will be picked up only if the Cassandra process restarts
>
> * If I can manage to sort my Cassandra nodes by their age, could it be a strategy to have the seeds set to the 2 oldest nodes in the cluster? (This implies these nodes would change as the cluster's nodes get upgraded/replaced).

You could do this, but it seems like a lot of headache for little benefit. It could be done with the simple seed provider and config management (puppet/chef/ansible) laying down new yaml, or with your own seed provider.

So, just to make it clear: sorting by age isn't a goal in itself, it was just an example of how I could get a stable list.

Right now, we have a dedicated group of seed nodes + a dedicated group of non-seeds. Doing a rolling upgrade of the nodes from the second group is relatively painless (although slow), whereas for the first group we are facing the issues discussed in CASSANDRA-3829: seed nodes do not bootstrap automatically, so we need to operate them in a more careful way.

Rolling upgrade shouldn't need to re-bootstrap. Only replacing a host should need a new bootstrap. That should be a new host in your list, so it seems like this should be fairly rare?

What I'm really looking for is a way to simplify adding and removing nodes in our (small) cluster: I can easily provide a small list of nodes from our cluster with our config management tool so that new nodes can discover the rest of the cluster, but the documentation seems to imply that seed nodes also have other functions, and I'm not sure what problems we could face trying to simplify this approach.

Ideally, what I would like to have would be:

* Considering a stable cluster (no new nodes, no nodes leaving), the N seeds should always be the same N nodes
* Adding new nodes should not change that list
* Stopping/removing one of these N nodes should "promote" another (non-seed) node as a seed
  - that would not restart the already running Cassandra nodes but would update their configuration files.
  - if a node restarts for whatever reason, it would pick up this new configuration

So: no node would start its life as a seed, only a few already existing nodes would have this status. We would not have to deal with the "a seed node doesn't bootstrap" problem and it would make our operation process simpler.
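
The selection rules above could be sketched roughly as follows. This is only an illustration of the idea: `next_seeds` is a hypothetical helper that a config management tool might call before writing each node's configuration file, and it assumes the tool can supply the previous seed list and the current cluster members ordered oldest first.

```python
def next_seeds(current_seeds, live_nodes, n):
    """Return the seed list to write into every node's configuration.

    current_seeds: the previously chosen seeds, in order.
    live_nodes: all nodes currently in the cluster, oldest first
                (an assumed ordering provided by the caller).
    n: desired number of seeds.
    """
    live = set(live_nodes)
    # Keep surviving seeds, so a stable cluster always yields the same list
    # and newly added nodes never displace an existing seed.
    seeds = [s for s in current_seeds if s in live][:n]
    # Promote existing non-seed nodes only to fill a vacancy,
    # preferring the oldest ones.
    for node in live_nodes:
        if len(seeds) >= n:
            break
        if node not in seeds:
            seeds.append(node)
    return seeds
```

For example, with seeds `["a", "b"]`, adding nodes `c` and `d` leaves the list unchanged, while removing `a` promotes `c`. The tool would only rewrite the configuration files, without restarting any running node.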

> I also have some more general questions about seed nodes and how they work:
>
> * I understand that seed nodes are used when a node starts and needs to discover the rest of the cluster's nodes. Once the node has joined and the cluster is stable, are seed nodes still playing a role in day to day operations?

They're used probabilistically in gossip to encourage convergence. Mostly useful in large clusters.

How "large" are we talking here? At how many nodes would a cluster start to be considered "large"?

~800-1000

Also, about the convergence: is this related to how fast/often the cluster topology is changing? (new nodes, leaving nodes, underlying IP addresses changing, etc.)

New nodes, nodes going up/down, and schema propagation.