cassandra-commits mailing list archives

From "Brandon Williams (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-9401) Better check for gossip stabilization on startup
Date Fri, 15 May 2015 20:12:00 GMT


Brandon Williams updated CASSANDRA-9401:
    Attachment: 9401.txt

> Better check for gossip stabilization on startup
> ------------------------------------------------
>                 Key: CASSANDRA-9401
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Tyler Hobbs
>            Assignee: Brandon Williams
>             Fix For: 3.x
>         Attachments: 9401.txt
> In CASSANDRA-4288, we started checking the active and pending counts in the Gossip stage
> on startup.  Once the active + pending counts are zero for three consecutive polls (1 second
> apart), we consider gossip to have stabilized.
> There are a few problems with this approach.  In large clusters, it may take a long
> time for this to happen.  There was one report of it taking 10 minutes in a 1700 node cluster.
> The second problem is that the polling cycle could happen to align with the gossip cycle
> (they both use a 1s period), resulting in active + pending being greater than zero for a long
> time.  Additionally, if CASSANDRA-9206 is committed, seed nodes would receive a lot of gossip
> traffic, making it very difficult for them to ever have no active or pending requests.
> A better approach would be to simply wait for the number of endpoint states to stabilize.
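
One way to picture the proposed check is sketched below. This is only an illustration of "wait for the number of endpoint states to stabilize", not the contents of 9401.txt. The class name GossipSettleSketch, the method waitForStableCount, and all of its parameters are hypothetical; the count is passed in as a supplier because the actual accessor for endpoint states is not named in this message.

```java
import java.util.function.IntSupplier;

// Hypothetical sketch: consider gossip settled once the number of known
// endpoint states stops changing for several consecutive polls.
public final class GossipSettleSketch {

    /**
     * Polls endpointCount every pollIntervalMillis and returns the number of
     * polls taken once the count has been unchanged for requiredStablePolls
     * consecutive polls. Returns -1 if maxPolls is reached first or the
     * thread is interrupted.
     */
    public static int waitForStableCount(IntSupplier endpointCount,
                                         int requiredStablePolls,
                                         int maxPolls,
                                         long pollIntervalMillis) {
        int previous = endpointCount.getAsInt(); // baseline reading
        int stablePolls = 0;
        for (int poll = 1; poll <= maxPolls; poll++) {
            try {
                Thread.sleep(pollIntervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return -1;
            }
            int current = endpointCount.getAsInt();
            if (current == previous) {
                stablePolls++;
                if (stablePolls >= requiredStablePolls) {
                    return poll; // count held steady long enough
                }
            } else {
                // Still learning about new endpoints: restart the streak.
                stablePolls = 0;
                previous = current;
            }
        }
        return -1; // never settled within maxPolls
    }
}
```

Unlike the active + pending check, this condition does not depend on the Gossip stage ever being idle, so polls that happen to align with the gossip cycle, or heavy gossip traffic on seed nodes, would not by themselves prevent it from being met. The poll interval, streak length, and upper bound are left as parameters here because the message does not specify them.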
