cassandra-commits mailing list archives

From "Erik Forsberg (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9183) Failure detector should detect and ignore local pauses
Date Thu, 21 May 2015 11:33:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554143#comment-14554143 ]

Erik Forsberg commented on CASSANDRA-9183:
------------------------------------------

This patch applies cleanly on 2.0 and has greatly increased my cluster stability. So if you
would consider inclusion into 2.0 that would be great.

> Failure detector should detect and ignore local pauses
> ------------------------------------------------------
>
>                 Key: CASSANDRA-9183
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9183
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Brandon Williams
>            Assignee: Brandon Williams
>             Fix For: 2.2.0 beta 1, 2.1.6
>
>         Attachments: 9183-v2.txt, 9183.txt
>
>
> A local node can be paused for many reasons, such as GC. If the pause is long enough,
> then when the node recovers it will think all the other nodes are dead until it gossips,
> causing UnavailableException (UAE) to be thrown to clients trying to use it as a
> coordinator. Instead, the failure detector (FD) can track the current time, and if the
> gap since its last check becomes too large, skip marking the nodes down (and perhaps
> reset the FD data).
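
The idea in the description can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the attached patches: the class name LocalPauseDetector, the method pausedSinceLastCheck, and the threshold parameter are all hypothetical. The sketch records the time of each failure-detector pass and reports a local pause when the gap between two passes exceeds a configured limit, in which case the caller would skip marking peers down for that pass.

```java
/**
 * Hypothetical helper illustrating local-pause detection.
 * Not taken from the CASSANDRA-9183 patch; names are assumptions.
 */
final class LocalPauseDetector {
    // Largest gap between two checks that is still treated as normal.
    private final long maxLocalPauseNanos;
    // Monotonic timestamp of the previous check.
    private long lastCheckNanos;

    LocalPauseDetector(long maxLocalPauseNanos, long startNanos) {
        this.maxLocalPauseNanos = maxLocalPauseNanos;
        this.lastCheckNanos = startNanos;
    }

    /**
     * Returns true if the time since the previous call exceeds the
     * threshold, meaning this node itself was probably paused and
     * should not convict its peers during this pass.
     */
    boolean pausedSinceLastCheck(long nowNanos) {
        long gap = nowNanos - lastCheckNanos;
        lastCheckNanos = nowNanos; // always advance, so one pause is reported once
        return gap > maxLocalPauseNanos;
    }
}
```

A caller would pass a monotonic clock reading such as System.nanoTime() on each pass and, when the method returns true, leave peer state untouched until fresh gossip arrives.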



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
