cassandra-commits mailing list archives

From "Brandon Williams (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9183) Failure detector should detect and ignore local pauses
Date Wed, 20 May 2015 21:02:03 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553102#comment-14553102 ]

Brandon Williams commented on CASSANDRA-9183:
---------------------------------------------

wasPaused was added simply to survive two rounds of interpret() on the same endpoint; it
wasn't intended to cross endpoints at all.  That said, I think you're right, and we'd
have to do something like track it per-endpoint instead.  Can you make a new ticket for this?
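
One possible reading of "track it per-endpoint", sketched in Java. This is not the
attached patch or Cassandra's actual code: the class PerEndpointPauseTracker, its
methods, and the use of plain strings for endpoints are all hypothetical. The idea is
that each endpoint carries its own pause flag, so an interpret() round for one endpoint
cannot clear the flag that another endpoint still needs.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: names and structure are assumptions, not Cassandra's API.
final class PerEndpointPauseTracker {
    // Endpoints that have not yet had an interpret() round since the last local pause.
    private final Set<String> pendingAfterPause = new HashSet<>();

    /** Called when a local pause is observed: flag every known endpoint individually. */
    synchronized void onLocalPause(Iterable<String> knownEndpoints) {
        for (String endpoint : knownEndpoints) {
            pendingAfterPause.add(endpoint);
        }
    }

    /**
     * Called at the start of an interpret() round for one endpoint.
     * Returns true (skip conviction) at most once per endpoint per pause,
     * and clears only that endpoint's flag.
     */
    synchronized boolean consumePause(String endpoint) {
        return pendingAfterPause.remove(endpoint);
    }
}
```

With a single shared boolean, the first endpoint interpreted after a pause would consume
the flag for everyone; with a set keyed by endpoint, each one gets its own skipped round.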

> Failure detector should detect and ignore local pauses
> ------------------------------------------------------
>
>                 Key: CASSANDRA-9183
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9183
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Brandon Williams
>            Assignee: Brandon Williams
>             Fix For: 2.2.0 beta 1, 2.1.6
>
>         Attachments: 9183-v2.txt, 9183.txt
>
>
> A local node can be paused for many reasons, such as GC. If the pause is long enough,
> then when the node recovers it will think all the other nodes are dead until it gossips,
> causing UnavailableException (UAE) to be thrown to clients trying to use it as a
> coordinator.  Instead, the failure detector (FD) can track the current time, and if the
> gap there becomes too large, skip marking the nodes down (reset the FD data perhaps).
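
A minimal Java sketch of the time-gap check described above, under stated assumptions.
The class LocalPauseDetector, its method name, and the threshold parameter are
hypothetical and are not taken from the attached patches. The caller passes in the
current monotonic time (for example System.nanoTime()) on each failure detector round;
if the gap since the previous round exceeds the threshold, the round is treated as a
local pause and the caller would skip marking nodes down.

```java
// Hypothetical sketch: names and the threshold are assumptions, not Cassandra's code.
final class LocalPauseDetector {
    private final long maxLocalPauseNanos;
    private long lastCheckNanos;
    private boolean hasChecked = false;

    LocalPauseDetector(long maxLocalPauseNanos) {
        this.maxLocalPauseNanos = maxLocalPauseNanos;
    }

    /**
     * Records the current time and reports whether the gap since the previous
     * call was larger than the configured threshold. The first call never
     * reports a pause because there is no earlier reading to compare against.
     */
    synchronized boolean pausedSinceLastCheck(long nowNanos) {
        boolean paused = hasChecked && (nowNanos - lastCheckNanos) > maxLocalPauseNanos;
        lastCheckNanos = nowNanos;
        hasChecked = true;
        return paused;
    }
}
```

Taking the timestamp as a parameter, rather than reading the clock inside the method,
keeps the check easy to exercise with synthetic times.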



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
