cassandra-commits mailing list archives

From "Brandon Williams (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-8336) Quarantine nodes after receiving the gossip shutdown message
Date Thu, 22 Jan 2015 21:28:35 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-8336:
----------------------------------------
    Attachment: 8336-v2.txt

This patch helps, but the problem with this approach is that the node can still flap, given
a sufficiently disjoint (gossip state-wise) cluster.  There are a few ways we can solve this:

* Quarantine after shutdown.  This has the consequence of not being able to restart a node
until the quarantine expires.

* Sleep for ring_delay or some interval after setting the shutdown state before sending the
rpc shutdown.  I'm not 100% sure this would prevent the flapping, and sleeping that long on
shutdown sucks just as much as not being able to reboot until the quarantine expires.

* Richard suggested a third way to me offline, which I'll discuss below.

With this method, when node X receives a shutdown event from Y, it updates its local
state for Y to version Integer.MAX_VALUE, so no further updates for the same generation will
be accepted, since they will always have a lower version.  When Y restarts, it will have a new
generation and everything will work normally.
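
As a rough illustration of the version-pinning idea above, here is a minimal sketch.  It is
not the code in the attached patch or Cassandra's actual Gossiper; the class and method names
(ShutdownPinSketch, apply, onShutdown) are hypothetical, and it assumes the usual rule that
remote state wins only with a higher generation, or the same generation and a higher version.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of version pinning; not Cassandra's actual gossip code. */
final class ShutdownPinSketch {
    /** Minimal stand-in for the (generation, version) pair tracked per endpoint. */
    static final class State {
        final int generation;
        final int version;

        State(int generation, int version) {
            this.generation = generation;
            this.version = version;
        }
    }

    private final Map<String, State> states = new HashMap<>();

    /**
     * Apply remote state only if it is newer than what we hold: a higher
     * generation, or the same generation with a higher version.
     * Returns true if the update was accepted.
     */
    boolean apply(String endpoint, int generation, int version) {
        State local = states.get(endpoint);
        boolean newer = local == null
                || generation > local.generation
                || (generation == local.generation && version > local.version);
        if (newer) {
            states.put(endpoint, new State(generation, version));
        }
        return newer;
    }

    /** On a shutdown event, pin the endpoint's version for its current generation. */
    void onShutdown(String endpoint) {
        State local = states.get(endpoint);
        if (local != null) {
            states.put(endpoint, new State(local.generation, Integer.MAX_VALUE));
        }
    }
}
```

In this sketch, after onShutdown("Y") any stale update for Y in the same generation is
rejected because its version cannot exceed Integer.MAX_VALUE, while an update carrying a
higher generation (Y restarting) is accepted as usual.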

There is one consequence to this method: gossipdisable/enable now has to generate a new
generation, which triggers the "has restarted, now UP" message on other nodes, but this
seems like a fairly minor thing.
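
To see why re-enabling gossip needs a new generation, consider this small sketch from the
restarting node's side.  The names (GossipToggleSketch, enableGossip, acceptedBy) are
hypothetical, and the generation is simply incremented here for illustration; how the real
generation value is chosen is outside the scope of this sketch.

```java
/** Hypothetical sketch of the re-enable consequence; names are illustrative only. */
final class GossipToggleSketch {
    private int generation;
    private int version = 0;

    GossipToggleSketch(int generation) {
        this.generation = generation;
    }

    /**
     * Re-enable gossip.  With newGeneration == false the node keeps its old
     * generation and only bumps its version; with true it starts a fresh
     * generation and resets the version counter before bumping it.
     */
    void enableGossip(boolean newGeneration) {
        if (newGeneration) {
            generation++;   // assumption: any strictly larger value would do
            version = 0;
        }
        version++;
    }

    /** Would a peer holding (peerGeneration, peerVersion) accept our current state? */
    boolean acceptedBy(int peerGeneration, int peerVersion) {
        return generation > peerGeneration
                || (generation == peerGeneration && version > peerVersion);
    }
}
```

A peer that pinned this node at (generation, Integer.MAX_VALUE) ignores everything sent
after enableGossip(false), but accepts the state after enableGossip(true), which is also
what makes the peer report the node as restarted.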

On the surface, it may seem easier to have Y just send a version of MAX_VALUE, but that
would only apply to nodes that receive it via gossip, not the ones that receive it via rpc,
which is likely the bulk of them.  It wouldn't be an optimization anyway, since we only
sleep for one gossip round, and the node(s) we gossip to will set the version before
propagating it to the rest of the cluster.

v2 does this.

> Quarantine nodes after receiving the gossip shutdown message
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-8336
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8336
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Brandon Williams
>            Assignee: Brandon Williams
>             Fix For: 2.0.13
>
>         Attachments: 8336-v2.txt, 8336.txt
>
>
> In CASSANDRA-3936 we added a gossip shutdown announcement.  The problem here is that
> this isn't sufficient; you can still get TOEs and have to wait on the FD to figure things
> out.  This happens due to gossip propagation time and variance; if node X shuts down and sends
> the message to Y, but Z has a greater gossip version than Y for X and has not yet received
> the message, it can initiate gossip with Y and thus mark X alive again.  I propose quarantining
> to solve this, however I feel it should be a -D parameter you have to specify, so as not to
> destroy current dev and test practices, since this will mean a node that shuts down will not
> be able to restart until the quarantine expires.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
