cassandra-commits mailing list archives

From "Vijay (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-6590) Gossip does not heal after a temporary partition at startup
Date Mon, 20 Jan 2014 05:38:20 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vijay updated CASSANDRA-6590:
-----------------------------

    Attachment: 0001-CASSANDRA-6590.patch

Nit: do_firewall_check is true by default in the yaml but is false in config.

The attached patch is on top of the original patch by Brandon.
It sets the hibernate state (a dead state) as step 1 in joinTokenRing; the state is changed
to normal at the end of the method.
The main fix (IMHO) is in the OTCP, where we now time out so that we can reconnect when the
socket hangs and makes the connection unusable during a temporary network partition.
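
To illustrate the timeout-then-reconnect idea described above, here is a minimal sketch. It is written in Python for brevity and is not the Java code in the patch; the function name read_or_reconnect, the connect callable, and the max_attempts parameter are all hypothetical. It only shows why a read timeout lets a caller abandon a hung socket and open a fresh connection instead of blocking forever.

```python
import socket


def read_or_reconnect(connect, timeout_s, max_attempts=3):
    """Try to read one reply; on a read timeout, drop the socket and reconnect.

    `connect` is a hypothetical zero-argument callable that returns a
    connected socket. Returns (data, attempts_used); data is None if every
    attempt timed out.
    """
    for attempt in range(1, max_attempts + 1):
        sock = connect()
        # Without a timeout, recv() could block indefinitely on a hung socket.
        sock.settimeout(timeout_s)
        try:
            data = sock.recv(1024)
            return data, attempt
        except socket.timeout:
            # The peer stayed silent: treat this connection as unusable
            # and fall through to open a new one on the next iteration.
            pass
        finally:
            sock.close()
    return None, max_attempts
```

With a timeout in place, a connection that went silent during the partition is discarded after timeout_s seconds, and the next attempt gets a chance to succeed once the partition has healed.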

Please note: this patch renames the streaming_socket_timeout_in_ms configuration option to
socket_timeout_in_ms and reuses it.

> Gossip does not heal after a temporary partition at startup
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-6590
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6590
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Brandon Williams
>            Assignee: Vijay
>             Fix For: 2.0.5
>
>         Attachments: 0001-CASSANDRA-6590.patch, 6590_disable_echo.txt
>
>
> See CASSANDRA-6571 for background.  If a node is partitioned on startup when the echo
> command is sent, but then the partition heals, the halves of the partition will never mark
> each other up despite being able to communicate.  This stems from CASSANDRA-3533.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
